BanglaBlend: A Large-Scale Nobel Dataset of Bangla Sentences Categorized by Saint(Sadhu) and Common(Cholito) Form of Bengali Language

Name: BanglaBlend: A Large-Scale Nobel Dataset of Bangla Sentences Categorized by Saint(Sadhu) and Common(Cholito) Form of Bengali Language
Creator: Umme Ayman ayman
Published: 2024-12-09T21:53:05.709Z
Keywords: Data Science, Natural Language Processing, Machine Learning, Bengali Language, Sentence Processing

ayman, Umme Ayman; Saha, Chayti; Mawa, Zannatul

doi:10.17632/7rx9mk8v4m.3

Published: 9 December 2024 | Version 3 | DOI: 10.17632/7rx9mk8v4m.3

Contributors:

Umme Ayman ayman, Chayti Saha, Zannatul Mawa

Description

The BanglaBlend dataset is a collection of Bangla (Bengali) sentences categorized into two forms: Saint (Sadhu) and Common (Cholito). It comprises 7,350 annotated Bangla sentences and has been preprocessed using several data preprocessing techniques. The dataset is designed to facilitate research and development in natural language processing (NLP) and computational linguistics, particularly for Bangla, a language widely spoken in Bangladesh and parts of India. In particular, it can be used for NLP tasks such as text summarization, text classification, sentiment analysis, and automatic language translation.
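As a first step before any of the tasks above, one might check how the sentences are split between the two forms. The sketch below is only illustrative: the file layout, the column names `sentence` and `label`, and the label strings `Sadhu` and `Cholito` are assumptions, since this record does not describe the files. The four sample sentences are toy examples written for this sketch, not rows taken from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical layout: the column names "sentence" and "label" and the label
# strings "Sadhu"/"Cholito" are assumptions, not taken from the dataset files.
# The rows are toy sentences written for this sketch.
SAMPLE_CSV = """sentence,label
তাহারা যাইতেছে,Sadhu
তারা যাচ্ছে,Cholito
আমি খাইতেছি,Sadhu
আমি খাচ্ছি,Cholito
"""


def label_counts(csv_file):
    """Count how many sentences carry each label in a CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row["label"].strip() for row in reader)


# An in-memory file stands in for the real data file here.
counts = label_counts(io.StringIO(SAMPLE_CSV))
print(dict(counts))
```

With a real file, the same function could be called on `open(path, encoding="utf-8", newline="")`, where `path` and the column names would need to match whatever the downloaded files actually contain.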

Institutions

Daffodil International University
