HCsig

Published: 15 August 2025 | Version 2 | DOI: 10.17632/d45hf6nxfv.2
Contributors:
Ingrid Hedenfalk

Description

Ovarian cancer lacks effective screening methods, which often results in late diagnosis and poor outcomes. Recognizing that high-grade serous ovarian carcinoma (HGSC) is driven by copy number alterations (CNAs) and that tumor DNA can be detected in cervical samples, we analyzed CNAs from shallow whole-genome sequencing of 212 cervical samples from 128 women with or without HGSC, including 29 germline BRCA1/2 mutation carriers. Using the machine-learning classifier SRIQ, we developed HCsig, a predictor for HGSC detection.

HCsig correctly identified HGSC in 79% of archival cervical samples, including 91% of stage I-II cases (0-27 months before diagnosis) and 77% of stage III-IV cases (0-65 months before diagnosis). Validation in 172 independent samples (0-98 months before diagnosis) showed 76% sensitivity and 94% specificity (AUC = 0.83), including high sensitivity for early-stage cancers. We show that applying the HCsig classifier to pre-diagnostic cervical samples, including samples from non-symptomatic women taken several years before diagnosis, is feasible and holds promise for early-stage detection of HGSC.

Data description: The folders contain processed shallow whole-genome sequencing data:
1. Segmented files (50kb_segmentation)
2. Copy number files (50kb_copynumbers)
3. Absolute copy number files (50kb_Rascal_absolute_CN)
4. Compiled CN features file (50kb_CN_features_matrix.txt)
5. Metadata file (MaNiLaMasterFile_250130.xlsx)
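For readers who want to recompute performance figures of the kind quoted in the description, the sketch below shows how sensitivity and specificity follow from confusion-matrix counts. The function name and the counts are hypothetical and chosen for illustration only; they are not taken from the HCsig data, and this is not the authors' evaluation code.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true cases that were detected.
    specificity = TN / (TN + FP): fraction of controls correctly called negative.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only; NOT taken from the HCsig dataset.
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=18, fp=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Applying the same function to per-sample predictions would require joining classifier calls with the case/control labels, which presumably come from the metadata file; the column names in that file are not described here, so that step is left out.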

Categories

Ovarian Cancer, Machine Learning, Genome, Early Diagnosis, Genome Sequencing

Licence