A Bangla Dataset to Detect Racism or Body Shaming from Sentence

Published: 18 September 2024 | Version 3 | DOI: 10.17632/bn26f4px6y.3

Description

Body shaming and racism are common in society. This dataset contains more than a thousand Bangla sentences that people use to humiliate others, including people they dislike, friends, family members, and those closest to them. It also includes word-level data: words that are used directly in such sentences. The data were collected from a Facebook post and from individual users via a Google Form. No personal information was requested, so that respondents could feel free to share sentences from embarrassing situations they had faced involving body shaming or racism. Each sentence is labeled 'Racism' or 'Body Shaming.' The dataset also includes a list of Bangla stop words to support data preprocessing and cleaning.
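
As a rough illustration of how a stop-word list can support preprocessing, the following is a minimal Python sketch, not the contributors' own pipeline. The function name `remove_stop_words` and the three stop words shown are assumptions made for the example; in practice the set would be loaded from the stop-word list supplied with the dataset, whose file name and format are not stated here.

```python
import string

# Illustrative stop words only (assumption); replace with the list
# shipped in the dataset when working with the real files.
EXAMPLE_STOP_WORDS = {"এবং", "ও", "এই"}

# ASCII punctuation plus the Bangla danda (U+0964), which ends sentences.
PUNCTUATION = string.punctuation + "।"


def remove_stop_words(sentence, stop_words):
    """Split on whitespace, trim edge punctuation, and drop stop words."""
    tokens = (tok.strip(PUNCTUATION) for tok in sentence.split())
    return [tok for tok in tokens if tok and tok not in stop_words]


if __name__ == "__main__":
    # Neutral placeholder sentence, not taken from the dataset.
    print(remove_stop_words("আমি এবং তুমি।", EXAMPLE_STOP_WORDS))
```

Whitespace splitting is the simplest possible tokenizer and is used here only to keep the sketch self-contained; a dedicated Bangla tokenizer may handle compound words and attached punctuation better.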

Steps to reproduce

Each district of Bangladesh has its own regional language, so the dataset can be extended with regional words and sentences that express racism or body shaming. So far, only a few regional words have been included.

Institutions

Daffodil International University

Categories

Data Mining, Natural Language Processing, Text Mining, Sentiment Analysis

Licence