Naive pluripotent stem cell-based models capture FGF-dependent human hypoblast lineage specification.

Published: 31 May 2024 | Version 1 | DOI: 10.17632/bg92x53rk4.1


10X processed scRNA-seq matrices and H5 files from the Cell Ranger pipeline for Dattani et al. (2024), Cell Stem Cell: "Naive pluripotent stem cell-based models capture FGF-dependent human hypoblast lineage specification". Sequencing was performed on a NovaSeq 6000, generating paired-end reads (28x10x10x90 bases). Sample demultiplexing, alignment, and quantification of barcode counts were performed using Cell Ranger Multi. Cell Ranger .h5 files from individual samples are attached in this dataset together with a .xls file containing sample information.

Scanpy was used to read and analyse raw read counts from the Cell Ranger output. Cells with fewer than 15,000 counts, more than 100,0000 counts, or more than 15% mitochondrial reads were filtered out. The resultant count matrix was normalised and log-transformed using recipe_zheng17. The top 1000 highly variable genes were identified for initial dimensionality reduction with PCA prior to non-linear dimensionality reduction using UMAP. Cells visualised in UMAP plots were coloured according to individual marker gene expression values, and the Leiden algorithm (resolution 0.8) was used to identify cell clusters. Outlier clusters that likely corresponded to mouse embryonic fibroblasts (DR4 MEFs) or that contained dying cells with >~12% mitochondrial reads were discarded.

Attached are two .h5ad.gz files that contain the normalised count matrices used to produce the UMAPs in Fig1 and Fig5 of doi: Also included are H5 files from the Cell Ranger output. For any questions regarding this data, please email: or
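The processing steps described above could be approximated with Scanpy roughly as follows. This is a minimal sketch, not the authors' script: the function names `passes_qc` and `cluster_sample`, the `MT-` prefix used to flag mitochondrial genes, and the default PCA and neighbour settings are assumptions. The upper count threshold is written ambiguously in the description, so it is left as a required argument rather than guessed.

```python
def passes_qc(total_counts, pct_mito, max_counts, min_counts=15_000, max_pct_mito=15.0):
    """Return True (or a boolean mask) for cells that survive the QC thresholds.

    Works on plain numbers as well as on pandas Series / NumPy arrays.
    max_counts has no default because the description's upper bound is unclear.
    """
    return (
        (total_counts >= min_counts)
        & (total_counts <= max_counts)
        & (pct_mito <= max_pct_mito)
    )


def cluster_sample(h5_path, max_counts):
    """Hypothetical helper: load one Cell Ranger .h5 file, filter, and cluster."""
    # Imported here so the threshold helper above stays usable without Scanpy.
    import scanpy as sc

    adata = sc.read_10x_h5(h5_path)
    adata.var_names_make_unique()

    # Assumption: human mitochondrial genes carry the "MT-" prefix.
    adata.var["mt"] = adata.var_names.str.startswith("MT-")
    sc.pp.calculate_qc_metrics(
        adata, qc_vars=["mt"], percent_top=None, log1p=False, inplace=True
    )

    # Drop cells outside the count range or above 15% mitochondrial reads.
    keep = passes_qc(adata.obs["total_counts"], adata.obs["pct_counts_mt"], max_counts)
    adata = adata[keep.to_numpy()].copy()

    # Normalise, select the top 1000 highly variable genes, log-transform, scale.
    sc.pp.recipe_zheng17(adata, n_top_genes=1000)

    # PCA, then UMAP on a neighbour graph, then Leiden clustering.
    sc.tl.pca(adata)
    sc.pp.neighbors(adata)
    sc.tl.umap(adata)
    sc.tl.leiden(adata, resolution=0.8, key_added="leiden")
    return adata
```

Calling `cluster_sample` on one of the attached .h5 files with an explicit `max_counts` returns an AnnData object whose `obs["leiden"]` column holds the cluster labels; the removal of MEF and high-mitochondrial outlier clusters described above would be a separate, manual step on that result.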



Single-Cell Transcriptomics