BanglaKBase
Description
BanglaKBase is a structured affective commonsense knowledge base for Bengali concepts. Each concept is a key in a dictionary and maps to a set of sentiment-related attributes: a primary emotion and a secondary emotion (e.g., #anger, #disgust), which together capture the dominant and supporting affective tones of the concept; a polarity label (positive or negative), which gives the general sentiment orientation; and a polarity score from -1 (strongly negative) to +1 (strongly positive), which quantifies sentiment intensity. Each concept is also linked to five semantically related Bengali terms, which give a broader view of its affective context and make it easier to use in computational analysis. For example, the concept ‘অনেক বেশি চরবি’ (excessively overweight) is labeled with strong negative emotions (#anger and #disgust), carries a polarity score of -0.97, and is linked to related terms such as ‘মোটা’, ‘বিচ্ছিরী’, and ‘ভারি’, all of which reinforce its negative tone.
The full knowledge base includes 30,000 annotated Bengali concepts. Among these, 15,126 are marked as positive and 12,492 as negative. Beyond polarity, the concepts span eight emotion categories: sadness (7,862), disgust (8,933), joy (8,926), surprise (4,570), anger (6,139), fear (3,132), admiration (10,306), and interest (7,208). These emotional dimensions support fine-grained affective analysis of Bengali text across a range of sentiment analysis and emotion-aware computing tasks.
Individual entries show the granularity and contextual sensitivity of the resource. For instance, the term ‘একটু’ (a little) is classified as negative with a polarity score of -0.79 and is linked to terms such as ‘অন্তত’, ‘ছোট’, and ‘ক্ষতিকারক’. In contrast, the closely related concept ‘একটু ক্ষুধার্ত’ (a little hungry) is categorized as positive with a score of 0.65 and is connected to expressions indicating satisfaction or hunger relief. Such examples show how the resource combines sentiment and semantics to model affective meaning.
Overall, BanglaKBase is intended as a resource for sentiment classification, emotion mining, and natural language understanding in Bengali, supporting both computational research and practical applications in affective computing.
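The entry structure described above can be sketched in Python as follows. This is only an illustration: the field names (`primary_emotion`, `secondary_emotion`, `polarity`, `polarity_score`, `related_terms`) and the helper functions are assumptions, not the layout of the published file, and the order of the two emotions in the first entry is assumed from the order in which the description lists them. Only values stated in the description are filled in.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory layout; the published file may use different keys.
kb: Dict[str, dict] = {
    "অনেক বেশি চরবি": {
        "primary_emotion": "#anger",      # assumed order: anger first
        "secondary_emotion": "#disgust",
        "polarity": "negative",
        "polarity_score": -0.97,
        # The description names only three of the five related terms.
        "related_terms": ["মোটা", "বিচ্ছিরী", "ভারি"],
    },
    "একটু": {
        "primary_emotion": None,          # not stated in the description
        "secondary_emotion": None,
        "polarity": "negative",
        "polarity_score": -0.79,
        "related_terms": ["অন্তত", "ছোট", "ক্ষতিকারক"],
    },
}


def concepts_with_emotion(kb: Dict[str, dict], emotion: str) -> List[str]:
    """Return concepts whose primary or secondary emotion equals `emotion`."""
    return [
        concept
        for concept, entry in kb.items()
        if emotion in (entry["primary_emotion"], entry["secondary_emotion"])
    ]


def mean_polarity(kb: Dict[str, dict], concepts: List[str]) -> Optional[float]:
    """Average the polarity scores of the concepts found in the knowledge base."""
    scores = [kb[c]["polarity_score"] for c in concepts if c in kb]
    return sum(scores) / len(scores) if scores else None
```

A lookup such as `concepts_with_emotion(kb, "#disgust")` filters by emotion tag, while `mean_polarity` gives a simple aggregate score for a list of extracted concepts and ignores any concept that is not in the knowledge base.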
Steps to reproduce
1. Collect Bengali text data from online sources such as news sites and social media.
2. Extract concepts using syntactic parsers and concept extraction tools.
3. Annotate each concept with a primary and a secondary emotion.
4. Assign a polarity label (positive/negative) and a polarity score (-1 to +1) to each concept.
5. Link five semantically related Bengali terms to each concept.
6. Structure each entry as a dictionary with emotional and semantic attributes.
7. Review and refine the entries through manual validation.
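As an aid to the structuring and review steps above, the sketch below shows one way to flag malformed entries automatically before manual validation. It is not the authors' procedure: the field names and the `validate_entry` helper are hypothetical, and the checks only encode constraints stated in the description (two emotion tags, a positive/negative label, a score in [-1, +1], and five related terms).

```python
from typing import List

ALLOWED_POLARITY = {"positive", "negative"}


def validate_entry(entry: dict) -> List[str]:
    """Return a list of problems found in one entry (empty if none)."""
    problems = []

    # Emotions are written as tags such as '#anger' in the description.
    for field in ("primary_emotion", "secondary_emotion"):
        value = entry.get(field)
        if not (isinstance(value, str) and value.startswith("#")):
            problems.append(f"{field} should be a tag such as '#anger'")

    if entry.get("polarity") not in ALLOWED_POLARITY:
        problems.append("polarity must be 'positive' or 'negative'")

    score = entry.get("polarity_score")
    if not isinstance(score, (int, float)) or not -1.0 <= score <= 1.0:
        problems.append("polarity_score must be a number in [-1, 1]")

    if len(entry.get("related_terms", [])) != 5:
        problems.append("related_terms must contain exactly five terms")

    return problems


# Placeholder entry for illustration only; not taken from the dataset.
example = {
    "primary_emotion": "#joy",
    "secondary_emotion": "#interest",
    "polarity": "positive",
    "polarity_score": 0.65,
    "related_terms": ["term1", "term2", "term3", "term4", "term5"],
}
print(validate_entry(example))
```

Entries that return a non-empty list can be routed to annotators for correction, which keeps the manual review focused on judgment calls rather than formatting errors.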