Augmented Resting-State EEG Dataset for Schizophrenia Diagnosis
Description
This repository contains an augmented resting-state electroencephalography (rs-EEG) dataset intended for automated classification and diagnosis of schizophrenia (SZ). It builds on a previously published public dataset of 41 participants (20 healthy controls [HC] and 21 individuals diagnosed with schizophrenia [SZ]), recorded with a 31-channel system at a sampling rate of 500 Hz.

To address the limited number of SZ samples and reduce the risk of deep learning models overfitting due to class imbalance, a data augmentation procedure was applied. The dataset includes 10 additional synthetic SZ recordings. Each synthetic recording was generated by adding small-amplitude Gaussian noise (standard deviation 1e-6) to a randomly selected raw EEG signal from the original SZ cohort. This perturbation level was chosen to be small enough to preserve the spatiotemporal and neurophysiological structure of the original EEG signals while adding enough variability to help model generalization.

Important note for machine learning applications: if you use this dataset to train predictive models (e.g., graph neural networks, CNNs), it is strongly recommended to place the synthetic recordings in the training set only. Each synthetic recording is a near-copy of an original SZ recording, so including one in a validation or test set may cause data leakage and produce overly optimistic performance metrics.
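The augmentation step and the train-only recommendation described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline. The function names, the array layout (channels x samples), sampling with replacement, the random seed, and the placeholder data are all assumptions. The sketch also keeps the source recording of every synthetic sample in the training set, an extra precaution that the description does not state.

```python
import numpy as np


def augment_with_noise(recordings, n_synthetic=10, noise_std=1e-6, seed=0):
    """Return (source_index, noisy_copy) pairs built from randomly chosen recordings.

    recordings: list of arrays shaped (n_channels, n_samples).
    Sampling with replacement is an assumption; the description only says
    the source signals were "randomly selected".
    """
    rng = np.random.default_rng(seed)
    source_indices = rng.integers(0, len(recordings), size=n_synthetic)
    synthetic = []
    for idx in source_indices:
        signal = recordings[idx]
        # Additive Gaussian noise with the standard deviation given in the description.
        noise = rng.normal(loc=0.0, scale=noise_std, size=signal.shape)
        synthetic.append((int(idx), signal + noise))
    return synthetic


def split_originals(n_original, synthetic_sources, test_fraction=0.2, seed=0):
    """Split original recording indices into train and test lists.

    Originals that served as sources for synthetic samples are forced into
    the training set (an assumption of this sketch), so no near-duplicate
    pair straddles the split. Synthetic samples themselves always go to train.
    """
    rng = np.random.default_rng(seed)
    sources = set(synthetic_sources)
    candidates = [i for i in range(n_original) if i not in sources]
    candidates = rng.permutation(candidates).tolist()
    n_test = min(len(candidates), round(test_fraction * n_original))
    test_idx = sorted(candidates[:n_test])
    train_idx = sorted(set(range(n_original)) - set(test_idx))
    return train_idx, test_idx


# Placeholder data standing in for the 21 original SZ recordings:
# 31 channels, 2 seconds at 500 Hz. Values are random, not real EEG.
demo_rng = np.random.default_rng(42)
sz_recordings = [demo_rng.normal(0.0, 1e-5, size=(31, 1000)) for _ in range(21)]

synthetic = augment_with_noise(sz_recordings, n_synthetic=10, noise_std=1e-6)
train_idx, test_idx = split_originals(
    n_original=len(sz_recordings),
    synthetic_sources=[src for src, _ in synthetic],
)

# Training data: original training recordings plus all synthetic recordings.
train_data = [sz_recordings[i] for i in train_idx] + [x for _, x in synthetic]
# Test data: original recordings only.
test_data = [sz_recordings[i] for i in test_idx]
```

With real data, the same split logic would be applied per subject across both classes. The point of the sketch is only that synthetic recordings, and ideally the originals they were derived from, never appear on the evaluation side.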
Files
Institutions
- Islamic Azad University, Mashhad (Razavi Khorasan, Mashhad)