Bangla Dialect Dataset: Exploring Linguistic Diversity Across Regions
Published: 3 February 2025 | Version 1 | DOI: 10.17632/sm63ryv5dt.1
Contributors:
Md. Julkar Naeen, Sourav Kumar Das, Supta Das Dip Supta, Tabib E Alahi, Md Faisal Tajwar Faisal, Abdullah Al Rahat, Samurtha Jahan Ayshe, Ummay Mahjabeen, Md Tanvir Tahmid, Mayen Uddin Mojumdar
Description
This dataset contains Bangla sentences from different regions of Bangladesh. It has a total of 7 columns. All but one of the columns hold sentences in a regional variety of Bangla, each column for a different region; the remaining column holds the corresponding standard ("actual") Bangla. The dataset is not balanced.
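Because the dataset is described as not balanced, it may help to check how many sentences each column actually holds before using it. The following is a minimal sketch, assuming the data is available as a CSV file with one header row and that "not balanced" means the columns have different numbers of filled cells. The column names (`standard`, `region_a`, `region_b`) and the sample rows are hypothetical placeholders, since the description does not give the real ones.

```python
import csv
import io

# Hypothetical stand-in for the dataset file. The real column names and
# sentences are not given in the description, so placeholders are used.
SAMPLE_CSV = """standard,region_a,region_b
s1,a1,b1
s2,a2,
s3,,
"""


def count_non_empty(csv_file):
    """Return {column name: number of non-empty cells} for a CSV file object."""
    reader = csv.DictReader(csv_file)
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            value = row.get(name)
            if value and value.strip():
                counts[name] += 1
    return counts


counts = count_non_empty(io.StringIO(SAMPLE_CSV))
for name, n in counts.items():
    print(name, n)
```

To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file object, for example `open(path, encoding="utf-8", newline="")`, where `path` points to the downloaded file.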
Files
Steps to reproduce
The data can be used directly to convert sentences from standard ("actual") Bangla to regional Bangla.
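One way to prepare the data for such a conversion task is to extract aligned sentence pairs for a single region. This is a sketch under assumptions, not the contributors' procedure: it assumes a CSV layout in which each row aligns the standard sentence with its regional counterparts, and the column names and sample rows are hypothetical placeholders.

```python
import csv
import io

# Hypothetical stand-in for the dataset file; real column names are unknown.
SAMPLE_CSV = """standard,region_a,region_b
s1,a1,b1
s2,a2,
s3,,
"""


def parallel_pairs(csv_file, source_col, target_col):
    """Collect (source, target) sentence pairs, skipping rows where either is empty."""
    reader = csv.DictReader(csv_file)
    pairs = []
    for row in reader:
        src = (row.get(source_col) or "").strip()
        tgt = (row.get(target_col) or "").strip()
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


pairs_a = parallel_pairs(io.StringIO(SAMPLE_CSV), "standard", "region_a")
pairs_b = parallel_pairs(io.StringIO(SAMPLE_CSV), "standard", "region_b")
print(len(pairs_a), len(pairs_b))
```

Skipping rows with a missing side keeps every pair usable for training or evaluation, at the cost of fewer pairs for regions with less data.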
Institutions
- Daffodil International University
Categories
Natural Language Processing, Bengali Language