Bengali Vocal Spectrum (BVS): A Bengali Voice Dataset for Psychological Stability Analysis
Description
The Bengali Vocal Spectrum (BVS) dataset supports psychological stability analysis from speech. It contains 85 audio recordings in the Bengali language, totaling approximately 4.65 hours, divided into two groups: 43 recordings (2.63 hours) from psychologically unstable patients and 42 recordings (2.02 hours) from psychologically stable individuals. The recordings were collected at Pabna Mental Hospital, Pabna, and the National Institute of Mental Health & Hospital, Dhaka. The dataset was compiled in line with ethical standards: all necessary permissions were obtained, and the confidentiality and anonymity of all participants were preserved. Each recording was preprocessed in Audacity to normalize the volume, remove background noise, and improve clarity, then saved in .wav format at a 48000 Hz sample rate. Spectrogram images, generated from the audio after segmenting it into 2-second clips, are provided in a separate folder so that the dataset can be used directly in signal processing and machine learning pipelines. Researchers can use this dataset to develop or refine algorithms that assess psychological stability from vocal characteristics. The format and quality of the data also make it suitable for more advanced signal processing tasks, which may help identify acoustic markers associated with psychological conditions.
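The segmentation into 2-second clips and the conversion to spectrograms described above can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the procedure the dataset authors used: the function names, the frame length, the hop size, and the Hann window are all assumptions, and a synthetic tone stands in for a real recording.

```python
import numpy as np

SAMPLE_RATE = 48_000   # Hz, as stated in the dataset description
CLIP_SECONDS = 2       # clip length stated in the dataset description


def split_into_clips(signal, sample_rate=SAMPLE_RATE, clip_seconds=CLIP_SECONDS):
    """Split a 1-D signal into non-overlapping clips, dropping any trailing partial clip."""
    clip_len = sample_rate * clip_seconds
    n_clips = len(signal) // clip_len
    return signal[: n_clips * clip_len].reshape(n_clips, clip_len)


def magnitude_spectrogram(clip, frame_len=1024, hop=512):
    """Short-time Fourier magnitude in dB (frame_len and hop are assumed values)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(clip) - frame_len) // hop
    frames = np.stack(
        [clip[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Transpose so rows are frequency bins and columns are time frames.
    return 20 * np.log10(magnitude + 1e-10).T


# Synthetic 5-second, 440 Hz tone standing in for a real recording.
t = np.arange(5 * SAMPLE_RATE) / SAMPLE_RATE
signal = np.sin(2 * np.pi * 440 * t)

clips = split_into_clips(signal)          # two full 2-second clips
spec = magnitude_spectrogram(clips[0])    # one spectrogram per clip
```

With a real file, the synthetic tone would be replaced by samples loaded from one of the .wav recordings; the resulting array can then be rendered as an image or passed to a model directly.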
Steps to reproduce
To reproduce the "Bengali Vocal Spectrum" dataset, begin by securing ethical approval and the necessary permissions from the participating institutions, such as Pabna Mental Hospital, Pabna, and the National Institute of Mental Health & Hospital (NIMH), Dhaka. Recruit participants whom mental health professionals have identified as psychologically stable or unstable. Conduct training sessions with these participants so that audio quality is consistent across recordings. Capture the audio with high-quality recording equipment in a controlled environment. Afterward, process each recording in Audacity to normalize the volume and remove background noise, and export the files in .wav format at a 48000 Hz sample rate. Throughout the process, adhere strictly to ethical guidelines to maintain participant confidentiality and data integrity. Following these steps allows the collection and preprocessing to be repeated and independently verified.
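For readers who prefer a scripted equivalent of the volume normalization and .wav export steps, a rough sketch is given below. It is not Audacity's implementation and does not attempt noise removal. The function names, the -1 dBFS peak target, and the mono 16-bit PCM output are assumptions chosen for illustration; only the 48000 Hz sample rate comes from the steps above.

```python
import io
import wave

import numpy as np

SAMPLE_RATE = 48_000  # Hz, as stated in the steps above


def peak_normalize(signal, target_dbfs=-1.0):
    """Scale a float signal so its peak sits at target_dbfs (assumed target)."""
    peak = np.max(np.abs(signal))
    if peak == 0:
        return signal.copy()  # silent input: nothing to scale
    target = 10 ** (target_dbfs / 20)
    return signal * (target / peak)


def write_wav(destination, signal, sample_rate=SAMPLE_RATE):
    """Write a float signal in [-1, 1] as mono 16-bit PCM WAV (assumed format)."""
    pcm = np.round(np.clip(signal, -1.0, 1.0) * 32767).astype("<i2")
    with wave.open(destination, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Quiet synthetic 1-second tone standing in for a raw recording.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
quiet = 0.1 * np.sin(2 * np.pi * 220 * t)

normalized = peak_normalize(quiet)
buffer = io.BytesIO()  # a file path such as "clip.wav" would work as well
write_wav(buffer, normalized)
```

Writing to an in-memory buffer keeps the example free of side effects; passing a file path instead produces a .wav file on disk.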