Supplementary Data for: Scalp Tape-Strip RNA Sequencing Captures Disease and Treatment-Responsive Signatures in Alopecia Areata
Description
This dataset accompanies the manuscript “Scalp Tape-Strip RNA Sequencing Captures Disease and Treatment-Responsive Signatures in Alopecia Areata”. It provides the processed transcriptomic data and analysis scripts used to characterize the molecular landscape of alopecia areata (AA) with a minimally invasive tape-strip approach. A total of 61 RNA-seq profiles were analyzed, comprising scalp tape-strip and bulk biopsy samples from healthy controls and from patients with AA across the clinical spectrum, before and after treatment with oral baricitinib. Gene expression was profiled with the Ion AmpliSeq™ Transcriptome Human Gene Expression Panel v2.0; reads were aligned to the human reference (hg19; hg19_AmpliSeq_Transcriptome_v1.1.04042016.Designed.bed), normalized with voom, and batch-corrected with ComBat (sva package).
The dataset includes:
1) counts_batch_corrected_ComBat_bySymbol.csv: log₂-scale, normalized, batch-corrected expression matrix (genes × samples).
2) metadata_final.csv: clinical and experimental metadata (group, severity, treatment, sample type).
3) custom_gene_sets_AA_tapestrips.gmt: curated gene sets representing key immune (IFN/JAK–STAT, Th1/Th2/Th17/Th22, cytotoxic/NK) and epithelial (follicular, keratinization, fibrosis, proteostasis) pathways.
4) /scripts folder: R scripts for GSVA, DEG analysis, and figure generation.
The tape-strip RNA-seq method captured the primary immune and epithelial signatures observed in lesional biopsies: strong activation of the interferon/JAK–STAT and cytotoxic T/NK pathways, suppression of follicular and keratinization programs, and molecular normalization in post-baricitinib responders. These data support the translational use of tape-strip transcriptomics as a noninvasive biomarker and disease-monitoring tool in alopecia areata. Together, the files allow the analyses and figures described in the publication to be reproduced with standard R (v4.3.2) workflows.
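For readers who want to inspect the gene sets outside R, the following is a minimal Python sketch of reading a .gmt file. It assumes only the standard GMT layout (tab-separated fields: set name, description, then gene symbols) and is not taken from the repository's scripts. The function name parse_gmt, the set names, and the genes in the example are hypothetical placeholders, not the contents of the deposited file.

```python
import io


def parse_gmt(handle):
    """Parse GMT-formatted text into a dict of {set_name: [gene symbols]}.

    Each GMT line is tab-separated: set name, description, then genes.
    """
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        # Drop empty trailing fields and duplicate symbols, keeping order.
        gene_sets[name] = list(dict.fromkeys(g for g in genes if g))
    return gene_sets


# Hypothetical two-line GMT content, used only to exercise the parser.
example = io.StringIO(
    "IFN_JAK_STAT\tillustrative\tSTAT1\tJAK2\tCXCL10\n"
    "KERATINIZATION\tillustrative\tKRT35\tKRT85\t\n"
)
sets = parse_gmt(example)
for set_name, members in sets.items():
    print(set_name, len(members), members)
```

To read the deposited file instead, the same function could be called on an open file handle, for example `parse_gmt(open("custom_gene_sets_AA_tapestrips.gmt"))`, assuming the file follows the standard GMT layout.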
Files
Steps to reproduce
Starting dataset: This repository provides the batch-corrected, voom-normalized expression matrix (voom_ComBat_corrected_matrix.csv) derived from tape-strip and bulk RNA-seq data of alopecia areata (AA) and control scalp samples. The matrix contains log₂-transformed normalized expression values for all genes passing quality filters (≥1 CPM in >50% of samples). Batch effects were adjusted using ComBat from the sva package after voom normalization.
Gene sets and metadata: Custom functional gene sets (gene_sets_GSVA_custom.gmt) include modules for IFN/JAK–STAT, Th1/Th2/Th17/Th22, cytotoxic/NK, antigen presentation, keratinization, follicular differentiation, fibrosis, lipid metabolism, and proteostasis pathways. Clinical and sample metadata (sample_metadata.csv) include group, severity, and treatment variables for GSVA stratification.
Analytical workflow: All analyses were conducted in R (v4.3.2) following the sequence of scripts provided in the /scripts folder:
- 01_GSVA_analysis.R: computes GSVA enrichment scores using the provided gene sets.
- 02_DEG_analysis.R: identifies differentially expressed genes (limma, FDR < 0.05).
- 03_visualization.R: generates volcano plots, heatmaps, boxplots, and correlation analyses.
- 04_summary_tables.R: compiles GSVA and DEG summary statistics.
Statistical methods: Differential expression was computed with limma using moderated linear models and Benjamini–Hochberg FDR correction. GSVA enrichment scores were compared between disease groups using one-way ANOVA with Tukey’s HSD post-hoc test. Tape-strip versus biopsy correlations were evaluated by Spearman’s r with bootstrap 95% CIs (2000 resamples).
Output and reproducibility: Running the scripts sequentially reproduces all major figures and statistical tables included in the manuscript: volcano plots (Figure 1B–F), heatmaps (Figure 1H), GSVA boxplots (Figure 2), and correlation analyses (Figure 3).
The repository is self-contained; no additional data beyond the provided expression matrix, gene sets, and scripts are required.
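As an illustration of two of the statistical steps named above, the following Python sketch shows a Benjamini–Hochberg adjustment and a percentile-bootstrap 95% CI for Spearman's correlation with 2000 resamples. It is an independent sketch using only the standard library, not the authors' R code. The percentile method, the fixed seed, the function names, and the paired scores are assumptions made for the example; the text does not state which bootstrap interval the scripts use.

```python
import random


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            out[order[k]] = mean_rank
        start = end + 1
    return out


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def bootstrap_spearman_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue  # correlation is undefined for a constant resample
        stats.append(spearman(xs, ys))
    stats.sort()
    lower = stats[round(alpha / 2 * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical paired pathway scores (tape strip vs. biopsy), for illustration only.
tape = [0.42, 0.31, -0.12, 0.55, -0.30, 0.10, 0.25, -0.05, 0.48, -0.22]
biopsy = [0.50, 0.20, -0.20, 0.61, -0.25, 0.02, 0.33, 0.05, 0.40, -0.35]

rho = spearman(tape, biopsy)
ci_low, ci_high = bootstrap_spearman_ci(tape, biopsy)
print(f"Spearman correlation = {rho:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
print("BH-adjusted:", bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Resampling whole (tape-strip, biopsy) pairs keeps each sample's two measurements together, which is what makes the bootstrap distribution reflect the paired design.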
Institutions
- Instituto Maimonides de Investigacion Biomedica de Cordoba
- Hospital Universitario Reina Sofia
- Icahn School of Medicine at Mount Sinai