Data for: Coregulatory long non-coding RNA and protein coding genes in serum starved cells

Published: 4 Dec 2018 | Version 1 | DOI: 10.17632/65j75yvx6w.1

Description of this data

Cell culture
Mouse embryonic fibroblasts (MEFs) were established from E12.5 embryos of C57BL/6 background and isolated using a standard procedure. MEFs were cultured in Dulbecco's Modified Eagle's Medium (DMEM) (Fisher Scientific) supplemented with 10% FBS, 100 U/mL penicillin, 100 μg/mL streptomycin and 2 mM L-glutamine. Only cultures at passages 1-3 were used for experiments. To starve MEFs, the cells were washed with PBS to remove residual FBS and then cultured for 24 h in DMEM containing 0.5% FBS and all the other supplements. The cells were then returned to regular DMEM with 10% FBS for 12 h.
RNA-seq library preparation and sequencing
Extracted RNA samples underwent quality control assessment using the RNA 6000 Pico chip on a Bioanalyzer 2100 (Agilent) and were quantified with a Qubit Fluorometer (Thermo Fisher). The RNA libraries were prepared and sequenced at the University of Houston Seq-N-Edit Core. mRNA libraries were prepared with the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) using 1000 ng of input RNA. Size selection of the libraries was performed using SPRIselect beads (Beckman Coulter), and the purity of the libraries was analyzed using the High Sensitivity DNA chip on a Bioanalyzer 2100 (Agilent). The prepared libraries were pooled and sequenced on a NextSeq 500 (Illumina), generating ~15 million 76 bp single-end reads per sample.
RNA-seq data processing and determination of differentially expressed genes
RNA-Seq reads were aligned to the mouse genome using TopHat [57,58] with parameters: -p 8 --read-mismatches 2 --b2-L 20 -g 5. An annotated list of known transcripts (from GENCODE) was used as the final transcriptome set. The biotype of each transcript (protein-coding or lncRNA) was taken from the GENCODE annotation. The abundance and the number of raw fragments aligned to each gene were computed using Cufflinks [57]. The abundance of each gene was expressed as FPKM (Fragments Per Kilobase of transcript per Million mapped reads).
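The FPKM unit described above can be illustrated with a minimal Python sketch. This is only the textbook definition of FPKM, not the Cufflinks estimation procedure, which additionally models effective transcript length and ambiguously mapped fragments. The function name `fpkm` and the numbers in the usage line are hypothetical and are not taken from this dataset.

```python
def fpkm(fragments: float, transcript_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Textbook FPKM: fragments per kilobase of transcript per million mapped.

    FPKM = fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

    Assumption: this is the plain definition only; Cufflinks reports values
    from a statistical model rather than this direct ratio.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total mapped fragments must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Illustrative values only (not from the dataset): 300 fragments on a
# 2,000 bp transcript in a library of 15 million mapped fragments.
example_value = fpkm(300, 2000, 15_000_000)
```

With these illustrative inputs the value works out to 10 FPKM: 300 fragments over 2 kb is 150 fragments per kilobase, divided by 15 million mapped fragments expressed in millions.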
To obtain differentially expressed genes between two conditions, we used DESeq [59]. The raw counts mapped to each transcript were obtained using htseq-count [60] and used as input to DESeq. To remove genes with low expression, we included only those with a total aligned fragment count across all samples greater than 5. We used a p-value of less than 0.05 and an absolute log2(fold_change) > 1 as the criteria to determine differentially expressed genes between two conditions. Enriched GO terms were obtained using DAVID [61].
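The filtering and selection criteria stated above can be sketched in Python as follows. This is a minimal illustration of the thresholds only (total count > 5, p-value < 0.05, |log2 fold change| > 1); it does not reproduce DESeq itself, and the function names, record layout, and gene identifiers are hypothetical toy values rather than results from this dataset.

```python
from typing import Dict, List, Sequence

MIN_TOTAL_COUNT = 5    # keep genes whose summed count exceeds this
P_CUTOFF = 0.05        # p-value must be below this
LOG2FC_CUTOFF = 1.0    # |log2 fold change| must exceed this


def passes_expression_filter(counts: Sequence[int]) -> bool:
    """True if the total fragment count across all samples is greater than 5."""
    return sum(counts) > MIN_TOTAL_COUNT


def is_differentially_expressed(p_value: float, log2_fold_change: float) -> bool:
    """True if both the p-value and fold-change criteria are met."""
    return p_value < P_CUTOFF and abs(log2_fold_change) > LOG2FC_CUTOFF


def select_de_genes(records: List[Dict]) -> List[str]:
    """Return ids of genes that pass the count filter and both DE criteria.

    Each record is assumed (hypothetically) to hold 'gene', 'counts',
    'p_value' and 'log2fc' keys, e.g. assembled from a DESeq result table.
    """
    return [
        r["gene"]
        for r in records
        if passes_expression_filter(r["counts"])
        and is_differentially_expressed(r["p_value"], r["log2fc"])
    ]


# Toy records for illustration only; these are not measured values.
toy_records = [
    {"gene": "geneA", "counts": [120, 98, 30, 25], "p_value": 0.001, "log2fc": -1.9},
    {"gene": "geneB", "counts": [1, 0, 2, 1], "p_value": 0.010, "log2fc": 2.5},
    {"gene": "geneC", "counts": [50, 55, 60, 58], "p_value": 0.400, "log2fc": 0.1},
    {"gene": "geneD", "counts": [10, 12, 30, 33], "p_value": 0.030, "log2fc": 1.0},
]
selected = select_de_genes(toy_records)
```

In this toy input only geneA is selected: geneB fails the count filter (total of 4), geneC fails both thresholds, and geneD has a log2 fold change of exactly 1, which does not satisfy the strict "greater than 1" criterion.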

This data is associated with the following publication:

Coregulatory long non-coding RNA and protein-coding genes in serum starved cells

Published in: BBA - Gene Regulatory Mechanisms

    Cite this dataset

    Liu, Yu; Wang, Fan; Soibam, Benjamin; Yang, Jin; Liang, Rui (2018), “Data for: Coregulatory long non-coding RNA and protein coding genes in serum starved cells”, Mendeley Data, v1




Animal Cell


CC BY 4.0

The files associated with this dataset are licensed under a Creative Commons Attribution 4.0 International licence.

What does this mean?
You can share, copy and modify this dataset so long as you give appropriate credit, provide a link to the CC BY license, and indicate if changes were made, but you may not do so in a way that suggests the rights holder has endorsed you or your use of the dataset. Note that further permission may be required for any content within the dataset that is identified as belonging to a third party.