Bengali Vocal Spectrum (BVS): A Bengali Voice Dataset for Psychological Stability Analysis

Published: 14 August 2024 | Version 1 | DOI: 10.17632/s5j25b5tjk.1
Contributors:
Rafiul Islam

Description

The Bengali Vocal Spectrum (BVS) dataset supports psychological stability analysis from speech. It contains 85 audio recordings in Bengali, totaling approximately 4.65 hours, divided into two groups: 43 recordings (2.63 hours) from psychologically unstable patients and 42 recordings (2.02 hours) from psychologically stable individuals. The recordings were collected at Pabna Mental Hospital, Pabna, and the National Institute of Mental Health & Hospital, Dhaka. All necessary permissions were obtained, and the confidentiality and anonymity of all participants were preserved. Each recording was preprocessed in Audacity to normalize the volume, remove background noise, and improve clarity, then saved in .wav format at a 48000 Hz sample rate. A separate folder contains spectrogram images generated from the audio after segmenting it into 2-second clips, which makes the dataset easier to use in signal processing and machine learning pipelines. Researchers can use the dataset to develop or refine algorithms that assess psychological stability from vocal characteristics. Its format and quality also suit signal processing work aimed at identifying acoustic markers associated with psychological conditions.
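The 2-second segmentation described above can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' pipeline: the function names `clip_bounds` and `split_wav` are hypothetical, dropping a trailing partial clip is an assumption, and spectrogram rendering is left out. At 48000 Hz, each 2-second clip holds 96000 frames.

```python
import wave

CLIP_SECONDS = 2        # clip length stated in the dataset description
EXPECTED_RATE = 48000   # sample rate stated in the dataset description


def clip_bounds(n_frames, sample_rate, clip_seconds=CLIP_SECONDS):
    """Return (start, end) frame indices for each full clip.

    Assumption: a trailing partial clip shorter than clip_seconds is dropped.
    """
    clip_len = int(sample_rate * clip_seconds)
    return [(s, s + clip_len) for s in range(0, n_frames - clip_len + 1, clip_len)]


def split_wav(path, clip_seconds=CLIP_SECONDS):
    """Read a PCM .wav file and return (sample_rate, clips).

    Each clip is the raw frame bytes for one clip_seconds-long segment.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        clips = []
        for start, end in clip_bounds(wav.getnframes(), rate, clip_seconds):
            wav.setpos(start)                      # jump to the clip's first frame
            clips.append(wav.readframes(end - start))
    return rate, clips
```

Each returned clip could then be passed to a spectrogram routine of the reader's choice. Comparing the returned rate against `EXPECTED_RATE` is a quick check that a file matches the format described above.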

Steps to reproduce

To reproduce the Bengali Vocal Spectrum dataset, first secure ethical approval and the necessary permissions from institutions such as Pabna Mental Hospital, Pabna, and the National Institute of Mental Health & Hospital, Dhaka. Recruit participants whom mental health professionals have identified as psychologically stable or unstable, and hold training sessions with them so that audio quality is consistent across recordings. Capture the audio with high-quality recording equipment in a controlled environment. Then process each recording in Audacity to normalize the volume, remove background noise, and export the file in .wav format at a 48000 Hz sample rate. Follow ethical guidelines throughout to protect participant confidentiality and data integrity. Following these steps should allow the collection procedure to be reproduced and verified.
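For readers who want to check processed files programmatically, the following Python sketch shows one possible analogue of the normalization and format steps. It is not a description of the Audacity settings used for this dataset: the -1 dBFS target peak, the 16-bit sample assumption, and the function names `peak_normalize` and `check_format` are all assumptions made for illustration, and noise removal is not covered.

```python
import array
import wave

TARGET_RATE = 48000     # sample rate stated in the steps above
TARGET_PEAK_DB = -1.0   # assumed target peak level; not stated in the source


def peak_normalize(samples, target_db=TARGET_PEAK_DB, full_scale=32767):
    """Scale signed 16-bit samples so the loudest one sits at target_db dBFS."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return array.array("h", samples)   # silence: nothing to scale
    gain = full_scale * 10 ** (target_db / 20) / peak
    # Clamp to the 16-bit range to guard against rounding overshoot.
    return array.array(
        "h", (max(-32768, min(32767, round(s * gain))) for s in samples)
    )


def check_format(path, expected_rate=TARGET_RATE):
    """Return True if the file opens as PCM WAV at the expected sample rate."""
    try:
        with wave.open(path, "rb") as wav:
            return wav.getframerate() == expected_rate
    except (wave.Error, EOFError):
        return False
```

Running `check_format` over every exported file is a cheap way to confirm that the whole set shares the 48000 Hz rate before any further analysis.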

Institutions

Daffodil International University

Categories

Mental Health, Audio Recording, Voice Input, Mental Disorder, Medical Care in Bangladesh, Bengali Language, Audio Recognition, Mental State, Audio Signal Analysis, Human Voice, Voice Recognition

Licence