Transcriptomic Dataset for Network based identification of key Master Regulators for Immunologic Constant of Rejection

Name: Transcriptomic Dataset for Network based identification of key Master Regulators for Immunologic Constant of Rejection
Creator: Raghvendra Mall
Published: 2021-02-01T09:38:28.801Z
Keywords: Transcriptomics, Expression Array

doi:10.17632/d9ffb7kkzt.3

Published: 1 February 2021 | Version 3 | DOI: 10.17632/d9ffb7kkzt.3

Contributor: Raghvendra Mall

Description

The goal of this study was to identify the molecular alterations governing the mechanisms of the immune exclusion phenotype. We developed a network-based approach to identify key transcription factors (TFs) associated with poor immunologic anti-tumor activity. Based on the Immunologic Constant of Rejection (ICR) signature, tumors may be classified as immune active (ICR High) or immune silent (ICR Low). We used The Cancer Genome Atlas (TCGA) RNA-Seq data from 12 specific cancer types (2,307 samples, 3,674 TFs, and 23,216 target genes) to build gene regulatory networks, determine each TF's regulon, compute a TF activity matrix across all tumor samples, and finally run a fast gene-set enrichment analysis to identify the most important TFs, named Master Regulators (MRs), that are unique to ICR Low and ICR High tumors, respectively.

Steps to reproduce

RNA-Seq data were downloaded from The Cancer Genome Atlas (TCGA) website and processed using TCGA Assembler (v2.0.3). The dataset comprises a total of p = 23,216 target genes, including 3,674 TFs. RNA-Seq data from 32 cancer types, restricted to the tissue type Primary Solid Tumor (TP), were used in our analysis. For melanoma (SKCM), metastatic samples (TM) were included for patients with no available TP sample. Normalization was performed within lanes, to correct for gene-specific effects (including GC content and gene length), and between lanes, to correct for sample-related differences (including sequencing depth), using the R package EDASeq (v2.12.0). The samples were then quantile normalized per cancer type using preprocessCore (v1.36.0) and log2 transformed for further analysis. The PRECOG folder in Data contains processed RNA-Seq datasets from the PRECOG repository for various cancer types; the two *.rds files need to be placed in the "ICR_Analysis/Data/PRECOG/" folder of our GitHub repository (https://github.com/raghvendra5688/ICR_Analysis).
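
To make the last two preprocessing steps concrete, the following is a minimal Python sketch of quantile normalization followed by a log2 transform on a genes x samples matrix. It is an illustration only and not the preprocessCore implementation used for this dataset: the function names are hypothetical, ties are broken by sort order rather than averaged, and the pseudocount of 1 added before the logarithm is an assumption (the description above does not state an offset).

```python
import numpy as np


def quantile_normalize(expr):
    """Give every sample (column) of a genes x samples matrix the same distribution.

    Illustrative sketch: tied values within a column are ranked by their
    position, not averaged as a dedicated package might do.
    """
    expr = np.asarray(expr, dtype=float)
    # Rank the genes within each sample independently.
    order = np.argsort(expr, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(expr, order, axis=0)
    # Reference distribution: mean across samples at each rank.
    reference = sorted_vals.mean(axis=1)
    # Write the reference values back in each sample's original gene order.
    normalized = np.empty_like(expr)
    np.put_along_axis(normalized, order, reference[:, None], axis=0)
    return normalized


def log2_transform(expr, pseudocount=1.0):
    """Log2-transform expression values; the pseudocount is an assumed offset."""
    return np.log2(np.asarray(expr, dtype=float) + pseudocount)


if __name__ == "__main__":
    # Toy matrix: 4 genes (rows) x 3 samples (columns) from one cancer type.
    toy = np.array([[5, 4, 3],
                    [2, 1, 4],
                    [3, 6, 6],
                    [4, 2, 8]], dtype=float)
    normalized = quantile_normalize(toy)
    print(normalized)
    print(log2_transform(normalized))
```

Because the normalization is described as being done per cancer type, a function like this would be applied separately to the sample matrix of each cancer type rather than to the pooled data.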

Institutions

Qatar Foundation
