Sensitivity of genes, molecular pathways and disease related categories to chemical exposures
The goal of this project is to identify molecular mechanisms sensitive to chemical exposures in an unbiased way. Results of this project are published on preprints.org (doi: 10.20944/preprints202006.0261.v1). The data files described below represent the major steps of our analysis:

1. Annotated chemical-gene interactions.xlsx
Data on chemical-gene interactions obtained from high-throughput toxicological genomic experiments with human, mouse, or rat cells and tissues were extracted from the Comparative Toxicogenomics Database (CTD, http://ctdbase.org/) on August 24, 2018. Genes not present in the genomes of all three species were filtered out. Chemical compounds were annotated for major uses with information from Wikipedia, PubChem, and PubMed. Based on this textual annotation, every compound was assigned one to three annotation terms from the following list: pharmaceutical, recreational drug, research, warfare, endobiotic, agricultural, cosmetics, environment, food components, industrial, and pollutant. All contributors annotated an equal number of chemicals, and AS checked every annotation to ensure a consistent approach. The resulting dataset includes 591,084 individual chemical-gene interactions.

2. Number of chemical-gene interactions per gene.xlsx
The dataset created in the previous step was used to determine the number of chemical-gene interactions for every gene: the total number, as well as the numbers of activating and suppressive interactions. We hypothesize that the number of chemical-gene interactions can be used as a measure of a gene's sensitivity to chemical exposures.

3. Enrichment of molecular pathways with genes sensitive to chemical exposures.xlsx
The list of genes with the total number of chemical-gene interactions for every gene was used as input for Gene Set Enrichment Analysis (GSEA, https://www.gsea-msigdb.org/gsea/index.jsp) against the Hallmark, KEGG, and Reactome gene sets to identify molecular pathways highly enriched with genes sensitive to chemical exposures. We suggest that the normalized enrichment score (NES) of every enriched pathway is a measure of that pathway's sensitivity to chemical exposures.
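The per-gene counting in step 2 can be illustrated with a minimal sketch. It is not the code used to build the data files: the record layout, the effect labels ("increases", "decreases", "affects"), the example rows, and the function names are all assumptions made for illustration. The sketch tallies total, activating, and suppressive interactions per gene, then orders genes by total count, which is the kind of ranked list a GSEA run would take as input.

```python
from collections import defaultdict

# Hypothetical records: (chemical, gene, effect).
# The effect labels and the rows themselves are illustrative assumptions,
# not values taken from the published dataset.
interactions = [
    ("bisphenol A", "ESR1", "increases"),
    ("bisphenol A", "CYP1A1", "increases"),
    ("cadmium", "CYP1A1", "decreases"),
    ("cadmium", "HMOX1", "increases"),
    ("arsenic", "HMOX1", "increases"),
    ("arsenic", "CYP1A1", "affects"),
]


def count_interactions_per_gene(records):
    """Tally total, activating, and suppressive interactions for each gene."""
    counts = defaultdict(lambda: {"total": 0, "activating": 0, "suppressive": 0})
    for _chemical, gene, effect in records:
        row = counts[gene]
        row["total"] += 1
        if effect == "increases":
            row["activating"] += 1
        elif effect == "decreases":
            row["suppressive"] += 1
        # Any other label counts toward the total only.
    return dict(counts)


def ranked_gene_list(counts):
    """Order genes by total interaction count, highest first (ties by name)."""
    pairs = [(gene, c["total"]) for gene, c in counts.items()]
    return sorted(pairs, key=lambda pair: (-pair[1], pair[0]))


counts = count_interactions_per_gene(interactions)
ranked = ranked_gene_list(counts)

for gene, total in ranked:
    print(gene, total, counts[gene]["activating"], counts[gene]["suppressive"])
```

Interactions whose direction is neither activating nor suppressive are counted in the total only, so the total can exceed the sum of the two directional counts. Whether the published counts treat such interactions the same way is not stated in this description.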
Steps to reproduce
The data can be reproduced by following the steps outlined in the data description above.