Sensitivity of genes, molecular pathways and disease related categories to chemical exposures

Published: 8 July 2020 | Version 1 | DOI: 10.17632/65fcympd2j.1

Description

The goal of this project is to identify, in an unbiased way, molecular mechanisms and disease categories sensitive to chemical exposures. Results of this project are published on preprints.org (doi: 10.20944/preprints202006.0261.v1). The data files described below represent the major steps of our analysis:

1. Annotated chemical-gene interactions.xlsx
Data on chemical-gene interactions obtained from high-throughput toxicological genomic experiments with human, mouse, or rat cells and tissues were extracted from the Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) on 24 August 2018. Olfactory receptor genes were removed from the resulting dataset because these genes have different names in different mammalian species. Chemical compounds were annotated for major uses with information from Wikipedia, PubChem, and PubMed. Based on this textual annotation, every compound was assigned one to three annotation terms from the following list: pharmaceutical, recreational drug, research, warfare, endobiotic, agricultural, cosmetics, environment, food components, industrial, and pollutant. All contributors annotated an equal number of chemicals, and AS checked every annotation to ensure a consistent approach. The resulting dataset includes 641,516 individual chemical-gene interactions.

2. Number of chemical-gene interactions per gene.xlsx
The dataset created in the previous step was used to determine the number of chemical-gene annotations for every gene, including the total number as well as the numbers of activating and suppressive chemical-gene annotations. We hypothesize that the number of chemical-gene interactions can be used as a measure of a gene's sensitivity to chemical exposures.

3. Enrichment of molecular pathways with genes sensitive to chemical exposures.xlsx
The list of genes with the total number of chemical-gene interactions for every gene was used as input for Gene Set Enrichment Analysis (GSEA, https://www.gsea-msigdb.org/gsea/index.jsp) against the Hallmark, KEGG, and Reactome datasets to identify molecular pathways highly enriched with genes sensitive to chemical exposures. We suggest that the normalized enrichment score (NES) of every enriched pathway is a measure of the pathway's sensitivity to chemical exposures.

4. Predicted adverse outcomes_KEGG.xlsx and Predicted adverse outcomes_Reactome.xlsx
After identifying the most sensitive pathways (see the previous step), we asked which adverse outcomes may be associated with perturbation of these pathways. To do so, we uploaded lists of the most sensitive KEGG and Reactome pathways (NES > 2.5, FDR < 0.1) to CTD and conducted an association analysis between pathway terms and disease terms. Many disease terms occurred more than once in the output of this analysis because they were associated with multiple sensitive pathways. We suggest that the number of occurrences is an indicator of the susceptibility of the corresponding health condition to chemical exposures.
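As an illustration of step 2, the per-gene counting could be sketched roughly as follows. This is a minimal sketch and not the code used to build the dataset: the record layout, the gene and chemical names, and the direction labels ("increases", "decreases") are hypothetical placeholders and do not reflect the actual columns of the .xlsx files.

```python
# Hypothetical records: (chemical, gene, direction). The field layout and the
# direction labels are illustrative assumptions, not the real file columns.
interactions = [
    ("chemical A", "TP53", "increases"),
    ("chemical B", "TP53", "decreases"),
    ("chemical C", "TP53", "increases"),
    ("chemical A", "CYP1A1", "increases"),
]


def count_interactions_per_gene(records):
    """Return {gene: {"total": n, "activating": n, "suppressive": n}}."""
    counts = {}
    for _chemical, gene, direction in records:
        entry = counts.setdefault(
            gene, {"total": 0, "activating": 0, "suppressive": 0}
        )
        entry["total"] += 1
        if direction == "increases":
            entry["activating"] += 1
        elif direction == "decreases":
            entry["suppressive"] += 1
    return counts


per_gene = count_interactions_per_gene(interactions)

# Rank genes by total number of interactions, highest first. A ranked list of
# this kind is what a preranked enrichment analysis would take as input.
ranked = sorted(per_gene.items(), key=lambda item: item[1]["total"], reverse=True)
```

Interactions whose direction is neither activating nor suppressive still contribute to the total in this sketch, which is one possible reading of "total number" in step 2.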

Steps to reproduce

The data can be reproduced by following the steps outlined in the Description section above.
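As a rough illustration of step 4 of the description, the pathway filtering and disease-term counting might look like the sketch below. Only the thresholds (NES > 2.5, FDR < 0.1) come from the description; the pathway names, disease names, field names, and the in-memory association table are hypothetical stand-ins for the GSEA output and the CTD association results.

```python
from collections import Counter

NES_MIN = 2.5  # thresholds quoted in the description
FDR_MAX = 0.1

# Hypothetical GSEA results; field names are illustrative assumptions.
pathways = [
    {"name": "pathway A", "nes": 3.1, "fdr": 0.01},
    {"name": "pathway B", "nes": 2.7, "fdr": 0.05},
    {"name": "pathway C", "nes": 2.6, "fdr": 0.20},  # fails the FDR cutoff
    {"name": "pathway D", "nes": 1.9, "fdr": 0.01},  # fails the NES cutoff
]

# Hypothetical pathway-to-disease associations standing in for CTD output.
associations = {
    "pathway A": ["disease X", "disease Y"],
    "pathway B": ["disease X"],
    "pathway C": ["disease Z"],
    "pathway D": ["disease Y"],
}


def sensitive_pathways(rows, nes_min=NES_MIN, fdr_max=FDR_MAX):
    """Keep the names of pathways passing both thresholds."""
    return [r["name"] for r in rows if r["nes"] > nes_min and r["fdr"] < fdr_max]


def disease_occurrences(selected, assoc):
    """Count, for each disease term, how many selected pathways it is linked to."""
    counts = Counter()
    for pathway in selected:
        # set() so that a disease is counted once per pathway
        counts.update(set(assoc.get(pathway, [])))
    return counts


selected = sensitive_pathways(pathways)
occurrences = disease_occurrences(selected, associations)
```

Here `occurrences.most_common()` would list disease terms from most to least frequently associated with the selected pathways.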

Institutions

University of Massachusetts Amherst

Categories

Molecular Biology, Public Health, Genomics, Systems Biology
