RNA Sequencing-Based Single Sample Predictors of Molecular Subtype and Risk of Recurrence for Clinical Assessment of Early-Stage Breast Cancer
Gene expression data and associated supplementary files from RNAseq of breast cancer samples from Staaf et al. (source reference below). Library preparation for mRNA-sequencing was done by a stranded dUTP mRNA protocol or by Illumina stranded TruSeq mRNA protocol. Expression data (Fragments Per Kilobase per Million reads, FPKM) was generated by an analysis pipeline utilizing Hisat/StringTie with GRCh38 human genome primary assembly and GENCODE Release 27 transcripts/genes. Gene expression data is summarized on GENCODE gene identifier. Gene and transcript definitions and gene annotations are from GENCODE Release 27. Detailed description including material and methods for RNAseq, Hisat/StringTie analysis pipeline, and the development of the Single Sample Predictor (SSP) models for Breast Cancer is available in Staaf et al. (source reference below). The developed SSP models are available as an R package available at GitHub (reference below).