Data for: Coregulatory long non-coding RNA and protein coding genes in serum starved cells

Published: 04-12-2018 | Version 1 | DOI: 10.17632/65j75yvx6w.1
Yu Liu,
Fan Wang,
Benjamin Soibam,
Jin Yang,
Rui Liang


Cell culture. Mouse embryonic fibroblasts (MEFs) were established from E12.5 embryos of C57BL/6 background. A standard procedure was used to isolate MEFs. MEFs were cultured in Dulbecco's Modified Eagle's Medium (DMEM) (Fisher Scientific) supplemented with 10% FBS, 100 U/mL penicillin, 100 μg/mL streptomycin and 2 mM L-glutamine. Only cultures at passages 1-3 were used for experiments. To starve the MEFs, the cells were washed with PBS to remove residual FBS and then cultured for 24 hours in DMEM containing 0.5% FBS and all the other supplements. The cells were then returned to regular DMEM with 10% FBS for 12 hours.

RNA-seq library preparation and sequencing. Extracted RNA samples underwent quality control assessment using the RNA 6000 Pico chip on the Bioanalyzer 2100 (Agilent) and were quantified with a Qubit Fluorometer (Thermo Fisher). The RNA libraries were prepared and sequenced at the University of Houston Seq-N-Edit Core. mRNA libraries were prepared with the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) using 1000 ng of input RNA. Size selection of the libraries was performed using SPRIselect beads (Beckman Coulter), and the purity of the libraries was analyzed using the High Sensitivity DNA chip on the Bioanalyzer 2100 (Agilent). The prepared libraries were pooled and sequenced on a NextSeq 500 (Illumina), generating ~15 million 76 bp single-end reads per sample.

RNA-seq data processing and determination of differentially expressed genes. RNA-seq reads were aligned to the mouse genome using TopHat [57,58] with the parameters: -p 8 --read-mismatches 2 --b2-L 20 -g 5. An annotated list of known transcripts (from GENCODE) was used as the final transcript set. The biotype of each transcript (protein-coding or lncRNA) was based on the GENCODE annotation. The abundance and number of raw fragments aligned to each gene were computed using Cufflinks [57].
The abundance of each gene was expressed as FPKM (fragments per kilobase of transcript per million mapped reads). To identify differentially expressed genes between two conditions, we used DESeq [59]. The raw counts mapped to each transcript were obtained using htseq-count [60] and were used as input to DESeq. To remove genes with low expression, we included only those with a total aligned fragment count across all samples greater than 5. We used a p-value below 0.05 and an absolute log2 fold change greater than 1 as the criteria for calling a gene differentially expressed between two conditions. Enriched GO terms were obtained using DAVID [61].
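
The thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce this dataset, where Cufflinks, htseq-count and DESeq did the actual computation. The function names and example numbers are hypothetical. The FPKM function uses the standard definition, and the description does not say whether the p-value is raw or adjusted, so the sketch simply takes whichever value is supplied.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Standard definition: fragments / (length in kb * total mapped in millions),
    which simplifies to fragments * 1e9 / (length_bp * total_mapped).
    """
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def passes_expression_filter(counts_per_sample, min_total=5):
    """Keep a gene only if its total count across all samples exceeds min_total."""
    return sum(counts_per_sample) > min_total


def is_differentially_expressed(p_value, log2_fold_change,
                                p_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the stated criteria: p-value < 0.05 and |log2 fold change| > 1."""
    return p_value < p_cutoff and abs(log2_fold_change) > lfc_cutoff


# Hypothetical example values, not taken from the dataset.
example_fpkm = fpkm(fragments=500, transcript_length_bp=2000,
                    total_mapped_fragments=15_000_000)
kept = passes_expression_filter([0, 2, 1, 4])         # total = 7 > 5
called = is_differentially_expressed(0.01, -1.8)      # down-regulated gene
not_called = is_differentially_expressed(0.01, 0.6)   # fold change too small
```

Because the fold-change test uses the absolute value, genes that are up-regulated and genes that are down-regulated by more than two-fold are both retained.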