Single cell RNA sequence RSEM data from TCGA

Published: 11 January 2022 | Version 2 | DOI: 10.17632/pr9j7x7nmh.2
Contributors:
Ferdous Mahin Kazi, MD Robiuddin, Mujahidul Islam

Description

PanClassif is a method that needs only a small set of effective genes to detect cancer from RNA-seq data and that can improve the performance of a wide range of machine learning classifiers. The dataset covers 22 cancer types from The Cancer Genome Atlas (TCGA), comprising 8287 cancer samples and 680 normal samples. Expression counts were calculated using RSEM. Folders whose names contain "_smoothed" hold data that have been smoothed using k-NN smoothing. A Python package is available (https://pypi.org/project/panclassif/#description).
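
As a rough illustration of the folder naming described above, the following sketch separates smoothed from unsmoothed folders after the archive has been extracted. It assumes only that smoothed folders carry "_smoothed" in their names, as stated in the description; the function name `split_folders` and any directory path passed to it are hypothetical, and the sketch does not use the panclassif package itself.

```python
from pathlib import Path


def split_folders(root):
    """Return (unsmoothed, smoothed) lists of folder names found directly under root.

    `root` is a hypothetical path to the extracted dataset; the only rule applied
    is the naming convention from the description ("_smoothed" in the folder name).
    """
    unsmoothed, smoothed = [], []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue  # skip loose files; only folders are classified
        if "_smoothed" in entry.name:
            smoothed.append(entry.name)
        else:
            unsmoothed.append(entry.name)
    return unsmoothed, smoothed
```

Calling `split_folders` on the directory where the download was unpacked returns two sorted lists of folder names, which can then be used to choose between the RSEM counts as provided and their k-NN-smoothed counterparts.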

Categories

RNA Sequencing

Licence