Reassessment of Marker Genes in Human Induced Pluripotent Stem Cells for Enhanced Quality Control

Published: 5 November 2023 | Version 1 | DOI: 10.17632/xs3pn527y9.1
Jochen Dobner


Author: Jochen Dobner

Description:

This data repository contains binary alignment map (BAM) files derived from Oxford Nanopore Technologies (ONT) long-read transcriptome sequencing of human induced pluripotent stem cells (iPSCs). Human iPSC12 (female) cells were differentiated into each of the three primary germ layers (endoderm, ectoderm, and mesoderm) by directed differentiation in biological duplicates. Total RNA was extracted, reverse transcribed into double-stranded cDNA, amplified, and subjected to sequencing library preparation (Sequencing by Ligation Native Barcoding). Samples were sequenced on a PromethION flow cell on a P2 Solo device. Raw FAST5 files were base-called using guppy basecaller (v6.4.6) and aligned to the human reference genome (GRCh38) using minimap2 (v2.24). Transcript counting and differential gene expression analysis were performed using the R (v4.2.2) packages Rsubread (v2.12.3) and edgeR (v3.40.2).

Reference-sorted BAM files were uploaded to this repository instead of FASTQ files because ever-growing data volumes will likely become a bottleneck in the future. BAM files are smaller than FASTQ files but contain all of their information in addition to the alignment. BAM files can be converted back to FASTQ files, e.g. with the SAMtools command 'samtools bam2fq INPUT.bam > OUTPUT.fastq'. For further details, please contact the repository owner.

The barcodes refer to the following samples:
barcode01: iPSC12 Undifferentiated n1
barcode02: iPSC12 Endoderm n1
barcode03: iPSC12 Ectoderm n1
barcode04: iPSC12 Mesoderm n1
barcode05: iPSC12 Undifferentiated n2
barcode06: iPSC12 Endoderm n2
barcode07: iPSC12 Mesoderm n2
barcode08: iPSC12 Ectoderm n2

The title of this repository refers to the manuscript of the same title, which has been submitted for publication.
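As an illustration, the barcode assignment and the BAM-to-FASTQ conversion described above could be sketched in Python as follows. This is a minimal sketch, not part of the original analysis: the file names (e.g. barcode01.bam) and the helper names are hypothetical, and the code only builds the SAMtools command string quoted in the description without running it.

```python
import shlex

# Barcode-to-sample assignment as listed in the repository description:
# barcode -> (condition, biological replicate number).
BARCODE_TO_SAMPLE = {
    "barcode01": ("Undifferentiated", 1),
    "barcode02": ("Endoderm", 1),
    "barcode03": ("Ectoderm", 1),
    "barcode04": ("Mesoderm", 1),
    "barcode05": ("Undifferentiated", 2),
    "barcode06": ("Endoderm", 2),
    "barcode07": ("Mesoderm", 2),
    "barcode08": ("Ectoderm", 2),
}


def replicates_by_condition():
    """Group barcodes by condition, e.g. 'Ectoderm' -> two barcodes."""
    groups = {}
    for barcode, (condition, _replicate) in sorted(BARCODE_TO_SAMPLE.items()):
        groups.setdefault(condition, []).append(barcode)
    return groups


def fastq_command(bam_path, fastq_path):
    """Build (but do not run) the shell command that converts BAM to FASTQ."""
    return f"samtools bam2fq {shlex.quote(bam_path)} > {shlex.quote(fastq_path)}"


# Hypothetical file names, assuming one BAM file per barcode.
for barcode in sorted(BARCODE_TO_SAMPLE):
    print(fastq_command(f"{barcode}.bam", f"{barcode}.fastq"))
```

Note that replicate n2 does not follow the same barcode order as n1 (barcode07 is mesoderm and barcode08 is ectoderm), so an explicit lookup table is safer than inferring the condition from the barcode number.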


Steps to reproduce

Summary of the reagents and tools used:

Cell line: Human induced pluripotent stem cell line iPSC12 (Cell Applications: iPS12-10)
Differentiation protocol: StemMACS™ Trilineage Differentiation Kit, human (Miltenyi)
Total RNA extraction: Trizol (Thermo) and Direct-zol RNA Miniprep (Zymo Research)
Double-stranded cDNA (dscDNA) transcription and amplification: Template Switching RT Enzyme Mix (New England Biolabs); RT primer (5'->3': AAG CAG TGG TAT CAA CGC AGA GTA CTT TTT TTT TTT TTT TTT TTT TTT TTT TTT TV); Template Switching Oligo (5'->3': GCT AAT CAT TGC AAG CAG TGG TAT CAA CGC AGA GTA CAT rGrGrG); cDNA amplification primer (5'->3': AAG CAG TGG TAT CAA CGC AGA GT); 7 cycles PCR amplification; Exonuclease I (Thermo) treatment
Purification: AMPure XP beads (Beckman Coulter)
Library preparation: Native Barcoding Kit 24 V14 (Oxford Nanopore Technologies, SQK-NBD114.24, protocol version NBA_9168_v114_revE_15Sep2022); Input: 20,25 ng per sample
Sequencing device: P2 Solo; PromethION flow cell; final library input: 15 fmol
Basecalling: guppy basecaller (v6.4.6); settings: minimum PHRED 5
Alignment: minimap2 (v2.24); settings: -uf -k14 (noisy sequencing data)
Counting of transcripts: R (v4.2.2) package Rsubread (v2.12.3); options: long sequencing reads
Differential gene expression analysis: R package edgeR (v3.40.2); post-hoc adjustment: Benjamini-Hochberg (p <= 0.05)
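To make the final step more concrete, the Benjamini-Hochberg adjustment can be illustrated with a minimal sketch. This is plain Python for illustration only and is not the edgeR code used for the analysis. The function names and the example p-values are hypothetical, and applying the 0.05 cutoff to the adjusted p-values is an assumption about how the threshold was used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant(p_values, alpha=0.05):
    """Flag tests whose adjusted p-value is at most alpha (assumed cutoff)."""
    return [q <= alpha for q in benjamini_hochberg(p_values)]


# Hypothetical raw p-values, for illustration only.
example = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(example))
print(significant(example))
```

Each raw p-value is multiplied by the number of tests divided by its rank, and the running minimum keeps the adjusted values monotone, which is why two genes with different raw p-values can end up with the same adjusted value.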


Leibniz-Institut für umweltmedizinische Forschung an der Heinrich-Heine-Universität Düsseldorf gGmbH


Stem Cells Research, RNA Sequencing, Induced Pluripotent Stem Cell, Genetic Marker, Ectoderm, Endoderm, Mesoderm, Induced Pluripotent Stem Cell Biology, Pluripotent Stem Cell Differentiation, Next Generation Sequencing


Ministerium für Kultur und Wissenschaft des Landes Nordrhein-Westfalen

Bundesministerium für Bildung und Forschung

Deutsche Forschungsgemeinschaft

RO05380/1-1 and PR1527/6-1



Leibniz Competition (SAW) Cooperative Excellence project (K246/2019)