Serum proteomic profiles in keratoconus, post-laser vision correction ectasia, and pellucid marginal degeneration

Published: 15 February 2026 | Version 1 | DOI: 10.17632/yzsktrxh6y.1
Contributors:
Katarzyna Jaskiewicz-Rajewicz

Description

This dataset accompanies the manuscript 'Common and distinct features in serum proteomic profiles in keratoconus, post-laser vision correction ectasia, and pellucid marginal degeneration' by Jaskiewicz-Rajewicz et al., submitted to Investigative Ophthalmology & Visual Science (IOVS).

In this study, serum samples from 93 patients with keratoconus (KTCN), 10 patients with post-laser vision correction (PLVC) ectasia, 4 patients with pellucid marginal degeneration (PMD), and 44 controls were profiled using tandem matrix-assisted laser desorption/ionization time-of-flight/time-of-flight mass spectrometry (MALDI-TOF/TOF MS/MS). Clinical and environmental variables (e.g. allergy/atopy/asthma, eye rubbing intensity), together with ophthalmologic parameters (K1, K2, Kmax, TCT, anterior/posterior elevation), were assessed using principal component analysis (PCA), Weighted Gene Co-expression Network Analysis (WGCNA), Mann–Whitney feature testing, linear modelling, Spearman correlation, and other analyses. The data were processed on 10 December 2025.

Mendeley Supplementary Table 1. Sample metadata, including Sample ID (short), corresponding Patient ID, study subgroup, sex, and age.

Mendeley Supplementary Table 2. Unprocessed m/z peak intensity values detected in the analyzed spots, indexed by the corresponding Sample ID (short).

Steps to reproduce

Sample processing and MALDI-TOF/TOF MS/MS

Blood was collected into serum tubes, processed by double centrifugation, aliquoted, and stored at −80 °C until analysis. Prior to mass spectrometry, serum aliquots were concentrated, desalted, and purified using C18 ZipTip microcolumns; eluates were mixed with α-cyano-4-hydroxycinnamic acid matrix and spotted in triplicate onto AnchorChip plates. MALDI-TOF spectra were acquired in linear-positive mode on a Bruker UltrafleXtreme instrument across m/z 1,000–10,000 using 2,000 laser shots per spectrum. External calibration was applied, and mass accuracy was monitored (average mass deviation ≤ 100 ppm). For peak identification, selected samples underwent MALDI-TOF/TOF MS/MS following in-solution tryptic digestion, with database searches against SwissProt.

Data preprocessing

Spectra were exported to tabular form and normalized to the Total Ion Current (TIC). M/z peaks missing in >30% of samples were removed. For the retained peaks, missing intensities were imputed as one-half of the minimum intensity observed for that peak across all samples. Zero-variance features were removed before the statistical analyses.

Statistical analyses

All computations on proteomic and clinical data were performed in R using the tidyverse, rstatix, broom, and WGCNA packages. Principal Component Analysis (PCA) was conducted with data scaling (z-score standardization of mean-centered proteomic data). To assess associations between PCA components and clinical covariates, linear regression models were fitted with PC1 and PC2 as dependent variables. For each model, regression coefficients, 95% confidence intervals, and standard errors were calculated. Per-feature statistical testing was performed using the Mann–Whitney U (Wilcoxon rank-sum) test.

Weighted Gene Co-expression Network Analysis (WGCNA) was performed with the WGCNA package. Prior to network construction, all m/z peak intensities were z-score scaled across samples. A signed co-expression network was constructed using a soft-thresholding power of 6, selected according to the scale-free topology criterion. Modules were detected using dynamic tree cutting with a minimum module size of 10. For each module, the module eigengene (ME; the first principal component of the module) was computed and subsequently averaged within each diagnostic subgroup. Module stability was evaluated using the median and interquartile range (IQR) of module eigengenes across diagnostic subgroups and tested with the Kruskal–Wallis rank sum test followed by Dunn’s post-hoc test.
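The preprocessing rules described above (TIC normalization, the >30% missingness filter, half-minimum imputation, and zero-variance removal) can be illustrated with a minimal sketch. The authors worked in R; the Python below is only an illustration under stated assumptions, not their code. The function name preprocess, the nested-dictionary input layout, the peak labels, and the choice to compute the TIC as the sum of detected peak intensities per sample are all hypothetical.

```python
from statistics import pvariance


def preprocess(spectra, max_missing=0.30):
    """Illustrative preprocessing of a peak table.

    spectra: dict mapping sample ID -> dict mapping peak label -> intensity,
             with None (or an absent key) meaning the peak was not detected.
    Returns a dict of the same shape holding only the retained peaks.
    """
    # Step 1: TIC normalization. ASSUMPTION: the TIC is taken as the sum of
    # the detected peak intensities of each sample.
    normalized = {}
    for sample, peaks in spectra.items():
        tic = sum(v for v in peaks.values() if v is not None)
        normalized[sample] = {
            p: (v / tic if v is not None else None) for p, v in peaks.items()
        }

    samples = list(normalized)
    all_peaks = sorted({p for peaks in normalized.values() for p in peaks})
    result = {s: {} for s in samples}

    for peak in all_peaks:
        values = [normalized[s].get(peak) for s in samples]
        observed = [v for v in values if v is not None]

        # Step 2: drop peaks missing in more than max_missing of the samples.
        if (len(samples) - len(observed)) / len(samples) > max_missing:
            continue

        # Step 3: impute missing values as half of the minimum observed
        # intensity of this peak across all samples.
        fill = min(observed) / 2
        filled = [v if v is not None else fill for v in values]

        # Step 4: drop zero-variance features.
        if pvariance(filled) == 0:
            continue

        for s, v in zip(samples, filled):
            result[s][peak] = v

    return result


# Toy input (invented values, not taken from the dataset).
demo = {
    "S1": {"mz_A": 1.0, "mz_B": 4.0, "mz_C": 5.0},
    "S2": {"mz_A": 1.0, "mz_B": None, "mz_C": 9.0},
    "S3": {"mz_A": 1.0, "mz_B": 6.0, "mz_C": 3.0},
    "S4": {"mz_A": 1.0, "mz_B": 2.0, "mz_C": 6.0, "mz_D": 1.0},
}
clean = preprocess(demo)
```

In this toy input, mz_D is absent from three of four samples and is filtered out, mz_A is constant after normalization and is removed as a zero-variance feature, and the missing mz_B value in S2 is replaced by half of the smallest normalized mz_B intensity.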
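To make the signed network step more concrete, the sketch below computes a signed adjacency with a soft-thresholding power of 6, using the standard WGCNA definition for signed networks, a_ij = ((1 + cor_ij) / 2)^power. It is a plain-Python illustration rather than the authors' R code. The use of Pearson correlation, the function names, and the toy intensities are assumptions; module detection and eigengene computation are not shown.

```python
from math import sqrt
from statistics import mean, stdev


def zscore(values):
    """Scale one feature across samples to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def signed_adjacency(features, power=6):
    """Signed adjacency: a_ij = ((1 + cor_ij) / 2) ** power.

    features: dict mapping peak label -> list of intensities (one per sample).
    ASSUMPTION: Pearson correlation; the source does not name the measure.
    """
    # Scaling mirrors the described workflow; Pearson correlation itself is
    # unchanged by z-scoring.
    scaled = {name: zscore(vals) for name, vals in features.items()}
    names = list(scaled)
    return {
        (a, b): ((1 + pearson(scaled[a], scaled[b])) / 2) ** power
        for a in names
        for b in names
    }


# Toy intensities across four samples (invented, not taken from the dataset).
toy = {
    "mz_1": [1.0, 2.0, 3.0, 4.0],
    "mz_2": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with mz_1
    "mz_3": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with mz_1
    "mz_4": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with mz_1
}
adj = signed_adjacency(toy)
```

In a signed network, strongly anti-correlated peaks receive an adjacency near 0 instead of being treated as connected, and uncorrelated peaks are pushed toward 0 by the power (0.5^6 ≈ 0.016 here).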

Categories

Mass Spectrometry, Proteomics, Matrix-Assisted Laser Desorption-Ionization, Corneal Ectatic Disease
