Bangla Speech Personality Traits Data

Published: 17 April 2024 | Version 1 | DOI: 10.17632/fb6dm3yb6m.1


The key characteristics of the collected data are summarized below:
• Data type: Audio
• Data size: 1750 (speech-text-phase), 1000 (speech-base-phase)
• Linguistic diversity: Short text
• Audio capturing quality: 44.1 kHz, mono

We created a new personality traits dataset for our research because there is a noticeable absence of datasets for automatically assessing personality from Bangla speech. When processed with machine learning models, the data showed that different personalities produce varying magnitudes at different frequencies, exhibiting distinct patterns.
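
To illustrate what "varying magnitudes at different frequencies" means in practice, the following Python sketch computes a magnitude spectrum with a naive discrete Fourier transform. It is not the authors' analysis code: the function names and the synthetic 1 kHz test tone are assumptions made for this example, and only the 44.1 kHz sampling rate comes from the description above.

```python
import cmath
import math

SAMPLE_RATE = 44_100  # Hz, the recording rate stated for this dataset


def magnitude_spectrum(samples, sample_rate):
    """Return (frequencies_hz, magnitudes) for the non-negative half of a naive DFT."""
    n = len(samples)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc) / n)
    return freqs, mags


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the largest non-DC spectral peak."""
    freqs, mags = magnitude_spectrum(samples, sample_rate)
    peak = max(range(1, len(mags)), key=mags.__getitem__)  # skip the DC bin
    return freqs[peak]


# Hypothetical input: a 10 ms synthetic 1 kHz tone standing in for real speech.
N = 441
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(N)]
peak_hz = dominant_frequency(tone, SAMPLE_RATE)
```

The naive transform is quadratic in the number of samples, so for full recordings an FFT implementation would be the practical choice; the sketch only shows which quantity is being compared across speakers.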


Steps to reproduce

To reproduce or use this data, use the two folders provided; both the raw data and the processed data are stored there. The data is suitable for automatic spectrum analysis and for training different machine learning models. The data acquisition process is described in detail in the section on data collection.
• Overview: The data were recorded with the Voice Recorder app (version v21.4.16.01) in an isolated room.
• Voice Recorder app functionality: Recording quality 44.1 kHz, 128 kbps, mono.
• Domain and check: Text drawn from Bangla newspapers and novels; data checked by students or professors of the Psychology or Bangla department.
• Speaker: Professional speaker (for both acted and non-acted speech).
• Data acquisition process: After the settings were configured, data were collected based on pre-generated text.
• Data storage: Collected data is temporarily saved on an SD card. After completion, it is stored on Google Drive for easy access from a computer.
• Recheck: Final audio quality check by the professional.
• Algorithm integration: The code includes algorithms that convert signals to numeric values using built-in functions. It also facilitates data storage on Google Drive.
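
The signal-to-numeric conversion mentioned in the last step can be sketched with Python's standard library. This is a minimal example under stated assumptions, not the dataset's actual code: it assumes the recordings have been converted to 16-bit mono PCM WAV files (the original file format is not specified here), and the helper names `read_mono_wav` and `write_test_tone` are hypothetical.

```python
import array
import math
import os
import sys
import tempfile
import wave


def read_mono_wav(path):
    """Read a 16-bit mono PCM WAV file; return (sample_rate, floats in [-1, 1))."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    pcm = array.array("h")
    pcm.frombytes(raw)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV samples are stored little-endian
    return rate, [s / 32768.0 for s in pcm]


def write_test_tone(path, rate=44_100, freq=440.0, num_samples=4410):
    """Write a half-amplitude sine tone, used here in place of a real recording."""
    pcm = array.array(
        "h",
        (int(32767 * 0.5 * math.sin(2 * math.pi * freq * t / rate))
         for t in range(num_samples)),
    )
    if sys.byteorder == "big":
        pcm.byteswap()
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm.tobytes())


# Round trip on a synthetic file so the example runs without the dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    write_test_tone(demo_path)
    rate, samples = read_mono_wav(demo_path)
```

Scaling the integer samples to floats in [-1, 1) keeps the values independent of bit depth, which is a common convention before spectral analysis or feature extraction.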


BRAC University


Computer Science, Natural Language Processing, Machine Learning, University Student, Personality Assessment, Wavelet Denoising, Sentiment Analysis, Transformer-Based Deep Learning