CITE-seq - MS Twin Study

Published: 21 December 2021 | Version 1 | DOI: 10.17632/278fy5m2yj.1
Eduardo Beltrán


We performed single-cell transcriptome, surface proteome and TCR analyses using the 10x Chromium Single Cell Immune Profiling Solution (5’ approach) with feature barcoding. We successfully sequenced 8 selected twin pairs discordant for MS (16 individuals in total) plus 2 additional healthy MS twins. This allowed us to pair the high-throughput CyTOF data from 116 twin samples with a high-resolution map of highly multiplexed surface protein expression and an unbiased in-depth analysis of the transcriptome (cellular indexing of transcriptomes and epitopes by sequencing: CITE-seq).


Steps to reproduce

Cell Ranger software (10x Genomics, v.6.1) was used to demultiplex samples, process raw data, align reads to the GRCh38 human reference genome and summarize unique molecular identifier (UMI) counts. Filtered gene-barcode and cell surface protein (CSP)-barcode matrices, which contained only barcodes whose UMI counts passed the threshold for cell detection, were used for further analysis. We then processed the filtered UMI count matrices using the R package Seurat (version 4.0.3; ref. 73). Cells that expressed fewer than 500 genes and/or had >15% mitochondrial reads, and genes expressed in fewer than 3 cells, were removed from the count matrix. After QC, the raw gene counts of high-quality singlets were subjected to log-normalization, identification of highly variable genes using the vst method, scaling, and regression against the number of UMIs and the mitochondrial RNA content per cell. We applied an unbiased calculation of the k-nearest neighbors and generated the neighborhood graph and the UMAP embedding. Differentially expressed genes between each cluster and all other cells were calculated using the FindAllMarkers function. Annotation of Seurat clusters was manually curated using a combination of the up-regulated genes for each cluster and visual inspection of key markers on the UMAP visualization.
After initial cluster annotation, we subsetted all clusters containing myeloid cells and reanalyzed this subset. After subsetting, integration using reciprocal PCA was performed to remove batch effects, and the integrated assay was used for principal component analysis and unsupervised clustering. Seurat subclusters were annotated using a combination of canonical protein and mRNA markers. One of the eight MS twin samples did not contain sufficient numbers of phagocytes for analysis, so we omitted this twin pair from downstream analysis.
As for the myeloid cells, CD4+ T cells were subsetted and reanalyzed, restricted to cells in which a TCR clonotype was detected.
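The cell- and gene-level QC thresholds stated above (fewer than 500 detected genes, >15% mitochondrial reads, genes detected in fewer than 3 cells) can be illustrated with a minimal, library-free Python sketch. This is not the Seurat code used for the dataset. The function name `filter_counts`, the dictionary layout, the use of the `MT-` gene-symbol prefix to identify mitochondrial genes, and the order of filtering (cells first, then genes) are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical layout: barcode -> {gene symbol -> UMI count}
Counts = Dict[str, Dict[str, int]]


def filter_counts(counts: Counts, min_genes: int = 500,
                  max_mito_pct: float = 15.0, min_cells: int = 3) -> Counts:
    """Apply cell-level, then gene-level, QC thresholds to a sparse count table."""
    kept_cells: Counts = {}
    for barcode, genes in counts.items():
        detected = {g: n for g, n in genes.items() if n > 0}
        total = sum(detected.values())
        if total == 0:
            continue  # barcode without any UMIs
        # Assumption: mitochondrial genes carry the "MT-" prefix (human symbols).
        mito = sum(n for g, n in detected.items() if g.startswith("MT-"))
        mito_pct = 100.0 * mito / total
        # Drop cells with too few detected genes or too much mitochondrial signal.
        if len(detected) < min_genes or mito_pct > max_mito_pct:
            continue
        kept_cells[barcode] = detected

    # Count, for every gene, the retained cells in which it is detected.
    cells_per_gene: Dict[str, int] = {}
    for genes in kept_cells.values():
        for g in genes:
            cells_per_gene[g] = cells_per_gene.get(g, 0) + 1

    # Drop genes detected in fewer than min_cells retained cells.
    return {
        barcode: {g: n for g, n in genes.items() if cells_per_gene[g] >= min_cells}
        for barcode, genes in kept_cells.items()
    }


# Toy usage with scaled-down thresholds (real data would use the defaults).
toy = {
    "c1": {"CD3E": 5, "CD4": 3, "MT-CO1": 1},   # kept: 3 genes, ~11% mitochondrial
    "c2": {"CD3E": 2, "CD4": 1, "MT-CO1": 7},   # dropped: 70% mitochondrial
    "c3": {"CD14": 4},                          # dropped: only 1 gene detected
    "c4": {"CD3E": 1, "CD4": 2, "LYZ": 6},      # kept
}
filtered = filter_counts(toy, min_genes=2, max_mito_pct=15.0, min_cells=2)
```

With the toy thresholds, only `c1` and `c4` survive, and genes seen in a single retained cell (`MT-CO1`, `LYZ`) are dropped. In Seurat itself, gene-level filtering is commonly applied when the object is created, so the order of the two steps may differ from this sketch.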
Single-cell TCRs were computed from the TR-sequencing data using the Cell Ranger vdj pipeline (10x Genomics, v.6.1). CD4+ T cells containing more than two β-chains were removed.
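The β-chain filter described above can be sketched as follows. This is an illustrative example, not the code used for the dataset. It assumes contig rows shaped like Cell Ranger's per-contig annotation table, with a `barcode` field and a `chain` field in which β-chains are labelled `TRB`. The function name `cells_passing_beta_filter` is hypothetical, and whether only productive contigs were counted is not stated in the description, so the sketch counts every row it is given.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def cells_passing_beta_filter(contigs: Iterable[Dict[str, str]],
                              max_beta: int = 2) -> Set[str]:
    """Return barcodes whose number of TRB contigs does not exceed max_beta."""
    beta_counts: Counter = Counter()
    barcodes: Set[str] = set()
    for row in contigs:
        barcodes.add(row["barcode"])
        if row["chain"] == "TRB":
            beta_counts[row["barcode"]] += 1
    # Cells with more than max_beta beta-chains are excluded.
    return {bc for bc in barcodes if beta_counts[bc] <= max_beta}


# Toy usage: cell "B" carries three beta-chains and is removed.
toy_contigs = [
    {"barcode": "A", "chain": "TRA"},
    {"barcode": "A", "chain": "TRB"},
    {"barcode": "B", "chain": "TRB"},
    {"barcode": "B", "chain": "TRB"},
    {"barcode": "B", "chain": "TRB"},
    {"barcode": "C", "chain": "TRA"},
]
kept = cells_passing_beta_filter(toy_contigs)
```

Here `kept` contains the barcodes `A` and `C`. A cell with exactly two β-chains would also be retained, since the stated criterion removes only cells with more than two.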


RNA Sequencing, Multiple Sclerosis, Twin Study