CSS: cluster similarity spectrum integration of single-cell genomics data
Description
Integrating single-cell sequencing data across experiments, conditions, batches, time points and other technical factors remains a major challenge. New computational methods are required that can integrate samples while preserving biological information. Here, we propose an unsupervised, reference-free data representation, the Cluster Similarity Spectrum (CSS), in which each cell is represented by its similarities to clusters identified independently in each sample. We show that CSS can be used to assess cellular heterogeneity, to reconstruct differentiation trajectories from cerebral organoid and other single-cell transcriptomic data, and to integrate data across experimental conditions and human individuals.

This data set includes:
1) the published scRNA-seq data of two-month-old human cerebral organoids (Kanton et al. 2019 Nature);
2) the published time-course scRNA-seq data of human cerebral organoid development (Kanton et al. 2019 Nature);
3) scRNA-seq data of cerebral organoids generated with inDrop;
4) newly generated scRNA-seq data of cerebral organoids with and without fixation;
5) example script to generate the results reported in the manuscript.
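
To make the CSS idea described above concrete, the following is a minimal sketch in Python with NumPy, not the script distributed with this data set. It assumes that cluster labels have already been computed separately within each sample, that similarity is measured as the Pearson correlation between a cell and a cluster's average expression profile, and that similarities are z-scored across the clusters of each sample before concatenation. The function name, the argument names and the toy data are hypothetical.

```python
import numpy as np


def cluster_similarity_spectrum(expr, sample_ids, cluster_ids):
    """Return a (n_cells, total_clusters) CSS-like matrix.

    expr        : (n_cells, n_genes) normalized expression values
    sample_ids  : length n_cells, sample of origin of each cell
    cluster_ids : length n_cells, cluster label assigned within that sample

    Assumptions of this sketch: every sample has at least two clusters and
    no cell or cluster profile is constant across genes (otherwise the
    divisions below produce NaN).
    """
    expr = np.asarray(expr, dtype=float)
    sample_ids = np.asarray(sample_ids)
    cluster_ids = np.asarray(cluster_ids)

    # Center and unit-scale each cell so that a dot product is Pearson's r.
    cells = expr - expr.mean(axis=1, keepdims=True)
    cells /= np.linalg.norm(cells, axis=1, keepdims=True)

    blocks = []
    for sample in np.unique(sample_ids):
        in_sample = sample_ids == sample
        # Average expression profile of each cluster found in this sample.
        centroids = np.stack([
            expr[in_sample & (cluster_ids == c)].mean(axis=0)
            for c in np.unique(cluster_ids[in_sample])
        ])
        centroids -= centroids.mean(axis=1, keepdims=True)
        centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)

        # Similarity of every cell (from any sample) to this sample's clusters.
        sim = cells @ centroids.T
        # Z-score each cell's similarities across this sample's clusters.
        sim = (sim - sim.mean(axis=1, keepdims=True)) / sim.std(axis=1, keepdims=True)
        blocks.append(sim)

    # Concatenate the per-sample similarity blocks into one spectrum per cell.
    return np.hstack(blocks)


# Toy input: 12 cells, 20 genes, 2 samples with 3 clusters each (random values).
rng = np.random.default_rng(0)
expr = rng.random((12, 20))
sample_ids = np.repeat(["A", "B"], 6)
cluster_ids = np.tile([0, 0, 1, 1, 2, 2], 2)

css = cluster_similarity_spectrum(expr, sample_ids, cluster_ids)
print(css.shape)  # (12, 6): one column per cluster per sample
```

Because every cell is compared with the clusters of every sample, cells from different samples end up in the same coordinate system, which is what allows downstream steps such as embedding or clustering to run on the combined matrix.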