Bangla Multilabel Cyberbully, Sexual Harrasment, Threat and Spam Detection Dataset

Name: Bangla Multilabel Cyberbully, Sexual Harrasment, Threat and Spam Detection Dataset
Creator: Saieef Sunny
Published: 2024-07-16T19:11:03.778Z
Keywords: Natural Language Processing, Bengali Language, Bullying, Sexual Harassment

Sunny, Saieef; Ahmed, Rezwan; Habib, Ahashan; Kabir, Sajida; Masud, Md Abdulla al Masud

doi:10.17632/sz5558wrd4.2

Published: 16 July 2024 | Version 2 | DOI: 10.17632/sz5558wrd4.2

Contributors:

Saieef Sunny, Rezwan Ahmed, Ahashan Habib, Sajida Kabir, Md Abdulla al Masud Masud

Description

This dataset is curated to support detecting and classifying several types of cyberbullying in Bangla social media text. It contains comments together with attributes of the person targeted and labels marking the presence of different types of abusive content. Because each comment carries five independent binary labels, the dataset can be used to train and evaluate machine learning models for multi-label classification, covering bullying, sexual harassment, religious hate speech, threats, and spam.

Dataset Columns

Gender: Gender of the person who faced the bullying (e.g., female).
Profession: Profession of the person who faced the bullying (e.g., dancer).
Comment: Text of the comment in Bangla (e.g., "এই দেশে এইসব কি হচ্ছে", which translates to "What is happening in this country").
Bully: Binary label indicating whether the comment contains bullying content (0 for no, 1 for yes).
Sexual: Binary label indicating whether the comment contains sexual harassment content (0 for no, 1 for yes).
Religious: Binary label indicating whether the comment contains religious hate speech (0 for no, 1 for yes).
Threat: Binary label indicating whether the comment contains threats (0 for no, 1 for yes).
Spam: Binary label indicating whether the comment is considered spam (0 for no, 1 for yes).
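The column layout described above maps naturally onto a per-comment label vector. The following is a minimal sketch of that mapping, not the authors' own loading code. It assumes the data is available as CSV with headers spelled exactly as in the description; the actual file format and header spelling are not stated here. The rows in SAMPLE_CSV are hypothetical placeholders with illustrative label values, not records from the dataset, and read_rows is a helper name introduced only for this example.

```python
import csv
import io

# Label columns as named in the dataset description (exact spelling assumed).
LABEL_COLUMNS = ["Bully", "Sexual", "Religious", "Threat", "Spam"]


def read_rows(handle):
    """Parse CSV rows into (comment, label_vector) pairs.

    Each label vector holds one 0/1 entry per column in LABEL_COLUMNS,
    so a single comment may carry several labels at once.
    """
    pairs = []
    for row in csv.DictReader(handle):
        vector = [int(row[name]) for name in LABEL_COLUMNS]
        if any(value not in (0, 1) for value in vector):
            raise ValueError(f"non-binary label in row: {row}")
        pairs.append((row["Comment"], vector))
    return pairs


# Hypothetical toy rows: the label values are illustrative only and the
# second and third comments are placeholders, not real dataset entries.
SAMPLE_CSV = """Gender,Profession,Comment,Bully,Sexual,Religious,Threat,Spam
female,dancer,এই দেশে এইসব কি হচ্ছে,0,0,0,0,0
female,dancer,placeholder comment A,1,1,0,0,0
female,dancer,placeholder comment B,0,0,0,0,1
"""

pairs = read_rows(io.StringIO(SAMPLE_CSV))

# Number of positive examples per label.
label_counts = {
    name: sum(vector[i] for _, vector in pairs)
    for i, name in enumerate(LABEL_COLUMNS)
}

# Number of comments that carry more than one label at once.
multi_labelled = sum(1 for _, vector in pairs if sum(vector) > 1)

print(label_counts)
print(multi_labelled)
```

To apply the same helper to the real data, pass an open file handle (for example, one opened with encoding="utf-8") in place of the in-memory string. Checking the per-label counts first is a cheap way to spot class imbalance before training a multi-label model.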
