Hausa Emotion-Tagged Tweets Dataset for Multi-Label Emotion Classification

Published: 8 October 2024 | Version 1 | DOI: 10.17632/xcyv6dyhtx.1
Contributors:
Hassan Adamu, Masrah Azrifah Azmi Murad, Nurul Amelina Nasharuddin

Description

The dataset comprises 19,757 Hausa tweets, each annotated against 11 emotion labels: anger, sadness, disgust, fear, surprise, joy, trust, optimism, pessimism, anticipation, and neutrality. It addresses a significant gap in Natural Language Processing (NLP) for underrepresented languages such as Hausa and is designed specifically for multi-label classification, in which a single tweet can carry several emotion labels at once. This makes the dataset suitable for training classical machine learning and transformer-based models that detect multiple emotions simultaneously. The tweets were collected through Twitter’s API, focusing on culturally significant events to capture a broad range of emotional responses. Native Hausa speakers labeled the tweets manually so that the annotations reflect the complex emotional expression common on social media. The dataset is intended to advance emotion classification in low-resource languages and to support the development of more robust models for multi-label text emotion classification.
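As an illustration of the multi-label setup described above, the following minimal sketch shows one common way to turn a tweet's emotion labels into an 11-dimensional binary target vector. The label order, the lowercase label spellings, and the helper names `encode_labels` and `decode_labels` are assumptions made for this example; the actual file layout and column names of the dataset are not specified here and may differ.

```python
# Hypothetical label order; the dataset's own column order may differ.
EMOTIONS = [
    "anger", "sadness", "disgust", "fear", "surprise", "joy",
    "trust", "optimism", "pessimism", "anticipation", "neutrality",
]


def encode_labels(labels):
    """Map a collection of emotion names to a multi-hot vector of length 11."""
    present = set(labels)
    unknown = present - set(EMOTIONS)
    if unknown:
        raise ValueError(f"Unknown emotion labels: {sorted(unknown)}")
    return [1 if emotion in present else 0 for emotion in EMOTIONS]


def decode_labels(vector):
    """Map a multi-hot vector back to the emotion names it marks as present."""
    if len(vector) != len(EMOTIONS):
        raise ValueError("Vector length must match the number of emotions")
    return [emotion for emotion, flag in zip(EMOTIONS, vector) if flag]


# A tweet expressing both joy and optimism gets two active positions.
target = encode_labels(["joy", "optimism"])
print(target)
print(decode_labels(target))
```

A multi-hot vector of this kind is the usual target for a classifier with one sigmoid output per emotion, as opposed to a single softmax over mutually exclusive classes.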

Institutions

Universiti Putra Malaysia

Categories

Computer Science, Natural Language Processing, Machine Learning, Artificial Intelligence Applications, Deep Learning

Funding

Petroleum Technology Development Fund

Licence