Single-cell RNA-seq RSEM data from TCGA
PanClassif is a method that uses a small set of effective genes to detect cancer from RNA-seq data and provides a performance gain across a wide range of machine learning classifiers. We took samples of 22 cancer types from The Cancer Genome Atlas (TCGA), comprising 8287 cancer samples and 680 normal samples. The expression counts in the datasets were calculated using RSEM. Folders with "_smoothed" in their names contain data that were smoothed using "k-NN smoothing". A Python package is available (https://pypi.org/project/panclassif/#description).
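To give a rough idea of what smoothing over nearest neighbors does to a count matrix, here is a minimal sketch in plain Python. It is a simplified illustration only: the published k-NN smoothing algorithm is more involved, and this is not the procedure used to produce the "_smoothed" folders. The function name `knn_smooth_simplified`, the input layout (one list of gene counts per sample), and the square-root transform used for the distance calculation are all assumptions made for this example, and it does not use the panclassif package.

```python
import math
from statistics import median


def knn_smooth_simplified(samples, k):
    """Simplified neighbor smoothing (illustrative, hypothetical helper).

    samples: list of samples, each a list of gene counts (same gene order).
    k: neighborhood size, counting the sample itself.
    Returns a new list in which each sample's counts are summed with the
    counts of its k - 1 nearest neighbors. Assumes no sample has zero
    total counts.
    """
    sizes = [sum(s) for s in samples]
    target = median(sizes)
    # Scale every sample to the median library size, then take square
    # roots so that highly expressed genes do not dominate the distances.
    transformed = [
        [math.sqrt(c * target / size) for c in s]
        for s, size in zip(samples, sizes)
    ]

    smoothed = []
    for i, ti in enumerate(transformed):
        # Rank all samples by squared Euclidean distance to sample i.
        # The (j != i, ...) key guarantees sample i itself is ranked first.
        order = sorted(
            range(len(samples)),
            key=lambda j: (
                j != i,
                sum((a - b) ** 2 for a, b in zip(ti, transformed[j])),
            ),
        )
        neighbors = order[:k]
        # Aggregate the raw counts of the sample and its neighbors.
        smoothed.append(
            [sum(samples[j][g] for j in neighbors) for g in range(len(ti))]
        )
    return smoothed


# Toy data: two pairs of similar samples over three genes.
toy = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [0, 1, 9]]
print(knn_smooth_simplified(toy, k=2))
# [[19, 1, 0], [19, 1, 0], [0, 1, 19], [0, 1, 19]]
```

With k=1 the function returns the input unchanged, since each sample is only aggregated with itself; larger k trades sample-specific detail for lower noise.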