Hate speech driven emotion recognition dataset for Bangla text
Description
Hate speech detection and emotion recognition are emerging research areas in Natural Language Processing (NLP). Text that expresses hate speech also conveys emotion, so the two tasks are closely related and merit being studied together. With this motivation, a multilevel dataset of natural Bangla text is proposed for Bangla hate speech and emotion recognition. The dataset contains 16,407 statements collected from natural comments and labeled at two levels. First, each statement is categorized as either hate or non-hate. Then, each category is further divided into three emotions: happy, sad, and angry. The dataset is validated with the Cohen’s Kappa and Fleiss’ Kappa agreement measures, both of which show good scores. The dataset is applicable to machine learning and natural language processing tasks for Bangla text.
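The two-level labeling and the Cohen's Kappa check described above can be illustrated with a minimal Python sketch. It is not the authors' validation code: the function name `cohen_kappa`, the tuple layout `(hate_label, emotion_label)`, and the four toy annotations are assumptions made for illustration, and no values from the actual dataset are used.

```python
from collections import Counter

# Label sets as described in the dataset summary.
HATE_LABELS = ("hate", "non-hate")
EMOTION_LABELS = ("happy", "sad", "angry")


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two annotators who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations, one (hate_label, emotion_label) pair per statement.
annotator_1 = [("hate", "angry"), ("non-hate", "happy"), ("non-hate", "sad"), ("hate", "angry")]
annotator_2 = [("hate", "angry"), ("non-hate", "happy"), ("hate", "sad"), ("hate", "angry")]

# Agreement is computed separately for each labeling level.
hate_kappa = cohen_kappa([h for h, _ in annotator_1], [h for h, _ in annotator_2])
emotion_kappa = cohen_kappa([e for _, e in annotator_1], [e for _, e in annotator_2])

print(f"hate level kappa:    {hate_kappa:.2f}")
print(f"emotion level kappa: {emotion_kappa:.2f}")
```

Cohen's Kappa compares exactly two annotators. Fleiss' Kappa extends the same idea of chance-corrected agreement to a fixed number of annotators per item, which is presumably why both measures are reported.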