Bangla Multilabel Cyberbully, Sexual Harassment, Threat and Spam Detection Dataset
Description
This dataset is curated to support the detection and classification of several types of cyberbullying in Bangla social media text. It contains comments together with attributes of the targeted person and labels indicating which types of abusive content are present. Because each comment carries an independent binary label per category, the dataset is suited to training and evaluating machine learning models for multi-label classification, covering bullying, sexual harassment, religious hate speech, threats, and spam.

Dataset Columns
Gender: The gender of the person who faced the bullying (e.g., female).
Profession: The profession of the person who faced the bullying (e.g., dancer).
Comment: The text of the comment in Bangla (e.g., "এই দেশে এইসব কি হচ্ছে", which translates to "What is happening in this country").
Bully: Binary label indicating whether the comment contains bullying content (0 for no, 1 for yes).
Sexual: Binary label indicating whether the comment contains sexual harassment content (0 for no, 1 for yes).
Religious: Binary label indicating whether the comment contains religious hate speech (0 for no, 1 for yes).
Threat: Binary label indicating whether the comment contains threats (0 for no, 1 for yes).
Spam: Binary label indicating whether the comment is considered spam (0 for no, 1 for yes).
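As a rough illustration of how the label columns could be used, the sketch below reads rows into (comment, label vector) pairs and counts how often each label is set. It assumes the data is a CSV file whose header uses the column names listed above; the file format is not stated in this description, and the two sample rows, along with the helper names `read_rows` and `label_counts`, are hypothetical and included only to make the example runnable.

```python
import csv
import io

# Label columns as named in the column list above.
LABEL_COLUMNS = ["Bully", "Sexual", "Religious", "Threat", "Spam"]


def read_rows(csv_file):
    """Parse rows into (comment, label_vector) pairs.

    Each label vector holds one 0/1 integer per entry in LABEL_COLUMNS,
    so a single comment may have several labels set at once.
    """
    reader = csv.DictReader(csv_file)
    pairs = []
    for row in reader:
        labels = [int(row[name]) for name in LABEL_COLUMNS]
        pairs.append((row["Comment"], labels))
    return pairs


def label_counts(pairs):
    """Count how many comments have each label set to 1."""
    counts = dict.fromkeys(LABEL_COLUMNS, 0)
    for _, labels in pairs:
        for name, value in zip(LABEL_COLUMNS, labels):
            counts[name] += value
    return counts


# Hypothetical rows for illustration only; these are not taken from the dataset.
SAMPLE_CSV = """Gender,Profession,Comment,Bully,Sexual,Religious,Threat,Spam
female,dancer,placeholder comment A,0,0,0,0,0
female,dancer,placeholder comment B,1,1,0,0,0
"""

pairs = read_rows(io.StringIO(SAMPLE_CSV))
counts = label_counts(pairs)
print(counts)
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by a file opened with `encoding="utf-8"` so that the Bangla text is read correctly. The per-label counts are a quick way to check class imbalance before training.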