Bangla Dialect Dataset: Exploring Linguistic Diversity Across Regions

Published: 3 February 2025 | Version 1 | DOI: 10.17632/sm63ryv5dt.1
Contributors:
Md. Julkar Naeen, Sourav Kumar Das, Supta Das Dip Supta, Tabib E Alahi, Md Faisal Tajwar Faisal, Abdullah Al Rahat, Samurtha Jahan Ayshe, Ummay Mahjabeen, Md Tanvir Tahmid, Mayen Uddin Mojumdar

Description

This dataset contains Bangla sentences from different parts of Bangladesh. It has a total of 7 columns. One column contains actual Bangla; each of the other columns presents sentences from a different region. The dataset is not balanced.
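
One way to see the imbalance is to count how many non-empty sentences each column holds. The following is a minimal sketch, assuming the data is available as CSV text with one header row. The column names and sample rows are hypothetical placeholders, since this description does not give the real file name, headers, or sentences, and "not balanced" is read here as "columns hold different numbers of sentences", which is an assumption.

```python
import csv
import io

# Hypothetical stand-in for the dataset: the real file name, column names,
# and sentences are not given in the description above.
SAMPLE_CSV = """standard,region_a,region_b
sentence 1,variant a1,variant b1
sentence 2,variant a2,
sentence 3,,
"""


def count_filled_cells(csv_text):
    """Return the number of non-empty cells in each column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Start every column at zero so empty columns still appear in the result.
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for column, value in row.items():
            # Skip missing cells and cells that contain only whitespace.
            if column in counts and isinstance(value, str) and value.strip():
                counts[column] += 1
    return counts


filled = count_filled_cells(SAMPLE_CSV)
for column, n in filled.items():
    print(f"{column}: {n} sentences")
```

With the real file, the same function could be given the file's contents (read with UTF-8 encoding) instead of SAMPLE_CSV; large differences between the per-column counts would indicate how uneven the regional coverage is.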

Steps to reproduce

The data can be reproduced by converting sentences from actual Bangla into regional Bangla.

Institutions

  • Daffodil International University

Categories

Natural Language Processing, Bengali Language

Licence