Gene expression dynamics during the embryonic development of mouse and chick telencephalon

Published: 20-12-2019 | Version 1 | DOI: 10.17632/rdt5757cbw.1
Vijaykumar Yogesh Muley,
Carlos Javier López-Victorio,
Jorge Tonatiuh Ayala Sumuano,
Adriana González-Gallardo,
Leopoldo González-Santos,
Carlos Lozano-Flores,
Gregory Wray,
Maribel Hernández,
Alfredo Varela


In mouse, a vesicle forms at the anterior part of the developing embryo at day/stage 9.5 (M9.5), which further develops into the telencephalon at day/stage 10.5 (M10.5). The equivalent stages in the chick embryo are referred to as HH17 and HH24. We sequenced RNA populations from the telencephalic region at the early (M9.5 and HH17) and late (M10.5 and HH24) stages of mouse and chick embryos. Four samples were sequenced for each stage of telencephalon development. The resulting RNA sequencing reads were used to assemble transcripts and to quantify their abundance. The read counts for each transcript were then used to compute its differential expression between the M9.5 and M10.5 stages in mouse. Likewise, each chick transcript was compared between the HH17 and HH24 stages. The raw read counts for transcripts and their differential expression results between the early and late stages of development are provided here.


Steps to reproduce

RNA-sequencing read counts data: Total RNA from mouse and chick telencephalon was processed with the TruSeq RNA-seq library prep and sequenced on the Illumina HiSeq 2500 sequencer (50 bp paired-end). Reads were mapped to the mouse genome (mm10) or the chick genome (galGal5) with TopHat, and RNA counts were obtained with HTSeq using the Ensembl GTF files corresponding to the genome versions. A list of orthologous gene pairs between mouse and chick was obtained from Ensembl and complemented with additional pairs identified by ProteinOrtho. RNA sequencing read counts for 12,902 pairs of orthologous genes with one-to-one mapping between the mouse and chick genomes were analyzed further.

Differential gene expression analysis: A gene was considered expressed if its count-per-million (CPM) value was above 5.66-7 in mouse and 6.72-7 in chick in at least four samples/libraries. The CPM cut-off is equal to 5/L, where L is the minimum library size in millions (Chen et al., 2016). This cut-off corresponds to roughly five or more reads in at least four of the eight libraries, irrespective of the developmental stages under consideration. The counts of expressed genes were normalized using the quantile method with the voom function in the limma package (Smyth et al., 2016). The resulting data were then used to compute differential gene expression using the moderated t-statistic method available in the limma package. The p-values derived from the t-statistics were subsequently corrected by the Benjamini and Hochberg approach, and genes with a corrected p-value equal to or less than 0.05 were considered differentially expressed.

The direction of the change in expression was determined from the log2-fold change values between stages. Genes with significant p-values and a positive log2-fold change show increased expression at developmental stage B (late) compared to stage A (early) and are referred to as up-regulated (UP). Likewise, genes with significant p-values and a negative log2-fold change show decreased expression at stage B compared to stage A and are referred to as down-regulated (Down, DN). Genes with corrected p-values above 0.05 were considered non-significant, representing no change between stage B and stage A, and are referred to as no change (NC). Genes that did not reach roughly five reads in at least four of the eight samples (as described above) were considered not expressed and are referred to as NE. These four groups of genes were further categorized into sixteen groups based on the expression status of the mouse and chick gene orthologs (column "group" in the differential expression analysis table). The genes were also classified in column "cortex" into five categories based on their cortical layer specific expression (Ayoub et al., 2011).
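
The filtering and classification rules described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the code used to produce the dataset (the original analysis relied on the limma package in R). The moderated t-test itself is not reproduced: the sketch assumes p-values and log2-fold changes are already available. All function names, the example numbers, and the "MOUSE-CHICK" label format are hypothetical and may differ from the values in the "group" column.

```python
from typing import List

STATUSES = ("UP", "DN", "NC", "NE")


def cpm_cutoff(library_sizes: List[int], min_reads: float = 5.0) -> float:
    """Return min_reads / L, where L is the smallest library size in millions."""
    smallest_in_millions = min(library_sizes) / 1e6
    return min_reads / smallest_in_millions


def is_expressed(counts: List[int], library_sizes: List[int],
                 min_libraries: int = 4) -> bool:
    """True if the gene's CPM is above the cut-off in at least min_libraries."""
    cutoff = cpm_cutoff(library_sizes)
    cpm = [count / (size / 1e6) for count, size in zip(counts, library_sizes)]
    return sum(value > cutoff for value in cpm) >= min_libraries


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(expressed: bool, log2_fc: float, adj_p: float,
             alpha: float = 0.05) -> str:
    """Assign one of the four statuses: UP, DN, NC or NE."""
    if not expressed:
        return "NE"
    if adj_p > alpha:
        return "NC"
    # A fold change of exactly zero would fall into DN here (sketch only).
    return "UP" if log2_fc > 0 else "DN"


def ortholog_group(mouse_status: str, chick_status: str) -> str:
    """Combine the two statuses into one of 4 x 4 = 16 groups.

    The "MOUSE-CHICK" label format is an assumption made for illustration.
    """
    return f"{mouse_status}-{chick_status}"


if __name__ == "__main__":
    sizes = [10_000_000] * 8            # hypothetical library sizes
    counts = [12, 9, 15, 11, 0, 1, 0, 2]  # hypothetical counts for one gene
    adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
    mouse = classify(is_expressed(counts, sizes), 1.3, adj[0])
    chick = classify(True, -0.7, adj[3])
    print(ortholog_group(mouse, chick))
```

Keeping the expression filter, the multiple-testing correction, and the status assignment as separate functions mirrors the order of the steps in the description, so each rule can be checked against the text on its own.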