CHN-ZZEHG: A Dataset of Uterine Electromyography Signals from Pregnant Women for Studying Human Labor Mechanisms and Preterm Birth Prediction

Published: 4 June 2026 | Version 1 | DOI: 10.17632/s52z5nkp3j.1
Contributors:
Lele Wang, Zhenzhen Liu, Hu Luo, Zaiying Ma, Yuxuan Zheng, Bei Zhou, Feng Yan, Qian Wang, Yuxin Ran, Shaoshuai Wang, Chuanxu Song, Liguo Song, Jieyun Bai, Hongbo Qi, Yumian Lai, Huishu Liu

Description

Background: Preterm birth is a leading cause of neonatal mortality and long-term health problems, yet clinical practice still lacks effective risk prediction tools. Electrohysterography (EHG), a non-invasive and objective technique for monitoring uterine activity, shows promise for elucidating the mechanisms of labor onset and for predicting preterm birth.
Objective: To construct CHN-ZZEHG, a large-scale, standardized public dataset of uterine electromyography signals from pregnant women, and to explore the feasibility of preterm birth prediction models built on it.
Methods: A total of 400 pregnant women were prospectively recruited from December 2024 to December 2025 at the Guangzhou Women and Children's Medical Center, affiliated with Guangzhou Medical University. In all, 444 high-quality EHG recordings were collected, covering delivery outcomes that include spontaneous term labor, term non-labor, preterm birth, and medically indicated termination of pregnancy. All data were acquired under a unified protocol with a standardized four-channel abdominal electrode configuration, then preprocessed with 0.1–3 Hz bandpass filtering and 50 Hz notch filtering. Each recording is accompanied by detailed clinical metadata, including gestational age, mode of delivery, and medication use. A support vector machine was then trained on these data to build a preterm birth prediction model.
Results: Technical validation supports the reliability and scientific quality of the dataset. In the signal quality assessment, over 92% of recordings were rated "good" or better. Physiological plausibility analysis showed that, with increasing gestational age, the average burst duration (DUR), power spectral density (PSD), and median frequency (MF) of the EHG signals increased significantly, while sample entropy (SampEn) decreased significantly (P < 0.01).
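To make the SampEn feature mentioned above concrete, the following is a minimal pure-Python sketch of sample entropy. It is illustrative only: the description does not state which parameters or implementation were used, so the embedding dimension `m=2`, the tolerance `r = 0.2 × SD`, the function name `sample_entropy`, and the two synthetic test signals are all assumptions, not part of the dataset.

```python
import math
import random


def sample_entropy(signal, m=2, r_factor=0.2):
    """Sample entropy of a 1-D sequence (hypothetical helper, assumed defaults).

    m        : embedding dimension (assumed, not taken from the dataset description)
    r_factor : tolerance as a fraction of the signal's standard deviation (assumed)
    """
    n = len(signal)
    if n <= m + 1:
        raise ValueError("signal is too short for the chosen m")

    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    r = r_factor * std

    def count_matches(length):
        # Use the same n - m template start points for both lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between two templates
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in theory; reported here as infinity
    return -math.log(a / b)


# Synthetic signals for illustration only (not EHG data).
regular = [0.0, 1.0] * 50                       # perfectly periodic
random.seed(0)
noisy = [random.gauss(0.0, 1.0) for _ in range(200)]  # irregular

se_regular = sample_entropy(regular)
se_noisy = sample_entropy(noisy)
```

Under this definition a more regular signal yields a lower value, which is the sense in which a decrease in SampEn with gestational age indicates more organized uterine activity.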
A preliminary preterm birth prediction model built on the CHN-ZZEHG dataset achieved an accuracy of 84.5%, a precision of 82.1%, a recall of 79.8%, and an F1-score of 0.81, supporting the dataset's direct value for developing prediction algorithms.
Conclusion: The CHN-ZZEHG dataset is a large-scale, standardized benchmark resource richly annotated with clinical metadata. It supports in-depth study of human labor mechanisms and the development of new preterm birth prediction methods. The dataset has been deposited in the PhysioNet open database and is publicly available under the CC-BY 4.0 license.
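As a quick consistency check on the reported metrics, the F1-score can be recomputed from the stated precision and recall as their harmonic mean. The sketch below uses only the two figures quoted above; the function name `f1_from_precision_recall` is a hypothetical helper, not something shipped with the dataset.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the description: precision 82.1%, recall 79.8%.
f1 = f1_from_precision_recall(0.821, 0.798)
print(round(f1, 2))  # prints 0.81
```

The recomputed value agrees with the reported F1-score of 0.81 to two decimal places.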

Categories

Electromyography, Prematurity Prevention, Early Labor, Uterine Disorder

Licence