ANUBHUTI: A COMPREHENSIVE CORPUS FOR SENTIMENT ANALYSIS IN BANGLA REGIONAL LANGUAGES

Published: 26 June 2025 | Version 1 | DOI: 10.17632/mjxwby94yw.1

Description

ANUBHUTI is a comprehensive dataset of 2,000 sentences manually translated from standard Bangla into four major regional dialects: Mymensingh, Noakhali, Sylhet, and Chittagong. The dataset predominantly features political and religious content, reflecting the contemporary socio-political landscape of Bangladesh, alongside neutral texts to maintain balance. Each sentence carries two annotations: (i) a multiclass thematic label that categorizes the sentence as Political, Religious, or Neutral, and (ii) a multilabel emotion annotation that assigns one or more emotions from Anger, Contempt, Disgust, Enjoyment, Fear, Sadness, and Surprise.
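The dual annotation scheme can be illustrated with a minimal Python sketch. The record layout, the field names (text, dialect, theme, emotions), and the helper functions are hypothetical and are not taken from the dataset files; only the label sets come from the description above. The ordering of the labels in the encodings is likewise an assumption.

```python
from dataclasses import dataclass

# Label sets as listed in the dataset description.
THEMES = ("Political", "Religious", "Neutral")
EMOTIONS = ("Anger", "Contempt", "Disgust", "Enjoyment",
            "Fear", "Sadness", "Surprise")


@dataclass(frozen=True)
class AnnotatedSentence:
    """Hypothetical record layout; the actual file schema may differ."""
    text: str
    dialect: str          # e.g. "Sylhet"
    theme: str            # exactly one of THEMES (multiclass)
    emotions: frozenset   # one or more of EMOTIONS (multilabel)


def encode_theme(theme: str) -> int:
    """Map the single thematic label to a class index."""
    return THEMES.index(theme)  # raises ValueError on an unknown theme


def encode_emotions(emotions) -> list:
    """Map a set of emotion labels to a multi-hot vector over EMOTIONS."""
    unknown = set(emotions) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotion labels: {sorted(unknown)}")
    return [1 if e in emotions else 0 for e in EMOTIONS]


# Placeholder record: the text is not a real sentence from the corpus.
example = AnnotatedSentence(
    text="<sentence text>",
    dialect="Sylhet",
    theme="Religious",
    emotions=frozenset({"Anger", "Fear"}),
)

theme_id = encode_theme(example.theme)
emotion_vector = encode_emotions(example.emotions)
```

The thematic label maps to a single class index, while the emotion annotation maps to a vector with one position per emotion, so a sentence can activate several emotions at once.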

Institutions

Ahsanullah University of Science and Technology

Categories

Dialect, Bengali Language, Sentiment Analysis

Licence