Bengali Abusive Language Based on Feminism
Description
Digital platforms let people create, share, and exchange content and information through virtual networks and communities. These spaces also host offensive remarks, which undermine constructive discourse and promote cyberbullying. In Bangladesh, feminist voices often face disproportionate harassment and abuse on digital platforms. Such language not only fuels online violence but also restricts women's freedom. Documenting it shows how words reinforce inequality and supports efforts toward more respectful communication for everyone.

To support researchers in Natural Language Processing (NLP), this dataset provides 6,830 abusive Bengali-language comments collected from social media platforms (Facebook, Instagram, and Twitter) and focused on feminist issues and gender-related discussions. It was assembled to aid studies of hate speech analysis, abusive language identification, and sociolinguistic patterns of online gender-based harassment in low-resource languages such as Bengali (Bangla).

The data were gathered by identifying public posts, hashtags, and discussion threads related to feminism, gender equality, and women's rights. Each comment was manually labeled as positive, negative (abusive), or neutral, based on its sentiment and relevance to feminist discourse, and was then reviewed by native Bangla speakers to confirm the presence of offensive content and contextual relevance.
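As a rough illustration of how the three-way labeling described above could be inspected, the following Python sketch tallies comments per label. The file layout is not documented here, so the column names `comment` and `label`, the label strings, and the placeholder rows are all assumptions made for illustration and are not taken from the dataset itself.

```python
import csv
import io
from collections import Counter

# Hypothetical layout: the real column names and label strings are not
# documented in the description, so "comment" and "label" are assumptions.
# The rows below are placeholders, not actual dataset content.
SAMPLE_CSV = """comment,label
placeholder comment A,negative
placeholder comment B,neutral
placeholder comment C,negative
placeholder comment D,positive
"""


def label_distribution(csv_text, label_column="label"):
    """Count how many rows carry each label value (case-insensitive)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column].strip().lower() for row in reader)


counts = label_distribution(SAMPLE_CSV)
for label in ("positive", "negative", "neutral"):
    print(label, counts[label])
```

To run this against the actual files, replace `SAMPLE_CSV` with the contents of the downloaded file and adjust `label_column` to match whatever header the file uses.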
Files
Institutions
- Southeast University