BANSpEmo: A Bangla Language Emotional Speech Recognition Dataset

Published: 30 May 2023 | Version 2 | DOI: 10.17632/rdwn4bs5ky.2
Md Gulzar Hussain, Mahmuda Rahman, Babe Sultana, Ye Shiren


BANSpEmo is the third audio dataset for speech emotion recognition (SER) in Bangla, a low-resource language. It consists of 792 utterance recordings covering six basic emotional reactions, spoken over two sets of sentences with six sentences in each set. The emotional states were explained to the speakers so that the utterances would be recorded more realistically than by simply reading the sentences aloud. The emotional states are Disgust (বিতৃষ্ণা), Happy (খুশি), Sad (দুঃখজনক), Surprised (বিস্মিত), Anger (রাগ), and Fear (ভয়). The corpus includes voice recordings from 22 non-professional speakers, 11 male and 11 female.
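For readers who want to use the six emotion categories as classification targets, the following is a minimal sketch of a label encoder. Only the emotion names and their Bangla glosses come from the description above; the integer ids, the names `EMOTIONS`, `LABEL_TO_ID`, `ID_TO_LABEL`, and the `encode` helper are assumptions made for illustration and are not part of the dataset or its file layout.

```python
# Emotion names and Bangla glosses as listed in the dataset description.
EMOTIONS = {
    "Disgust": "বিতৃষ্ণা",
    "Happy": "খুশি",
    "Sad": "দুঃখজনক",
    "Surprised": "বিস্মিত",
    "Anger": "রাগ",
    "Fear": "ভয়",
}

# Integer ids are an arbitrary choice for this sketch (listing order),
# not something the dataset defines.
LABEL_TO_ID = {name: i for i, name in enumerate(EMOTIONS)}
ID_TO_LABEL = {i: name for name, i in LABEL_TO_ID.items()}


def encode(label: str) -> int:
    """Map an emotion name (any casing, surrounding spaces allowed) to its id."""
    return LABEL_TO_ID[label.strip().capitalize()]


if __name__ == "__main__":
    for name, bangla in EMOTIONS.items():
        print(encode(name), name, bangla)
```

How a label is obtained for each recording depends on the file naming in the downloaded archive, which is not described here, so that step is left out of the sketch.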



Green University of Bangladesh


Speech Recognition, Audio Signal Processing, Human-Computer Interaction, Text-to-Speech, Information-Processing of Emotion


Center for Research Innovation and Transformation (CRIT), Green University of Bangladesh