EmoTweetID: Indonesian Emotion Tweet Dataset

Name: EmoTweetID: Indonesian Emotion Tweet Dataset
Creator: Kuncahyo Setyo Nugroho
Published: 2025-12-04T22:36:02.508Z
Keywords: Natural Language Processing, Affective Computing

Setyo Nugroho, Kuncahyo; Abdurrachman Bachtiar, Fitra; Firdaus Mahmudy, Wayan; Martianus Henry, Matthew; Isnan, Mahmud; Pangestu, Gusti; Pardamean, Bens

doi:10.17632/jzgnjsff9f.5

Published: 4 December 2025 | Version 5 | DOI: 10.17632/jzgnjsff9f.5

Contributors:

Kuncahyo Setyo Nugroho, Fitra Abdurrachman Bachtiar, Wayan Firdaus Mahmudy, Matthew Martianus Henry, Mahmud Isnan, Gusti Pangestu, Bens Pardamean

Description

The EmoTweetID dataset is a publicly available resource of Indonesian tweets collected from X (formerly Twitter) using emotion-related keywords. The dataset consists of three main components:

1. EmoTweetID-Corpus.csv: 3,126,987 unlabeled tweets for unsupervised tasks such as word embedding construction.
2. EmoTweetID-Lexicon.csv: 2,243 tweets automatically annotated using the Indonesian NRC EmoLex.
3. EmoTweetID-Human.csv: 2,243 tweets manually annotated by three psychology students, with inter-annotator agreement measured using Cohen’s and Fleiss’ Kappa.

Both annotated files (EmoTweetID-Lexicon.csv and EmoTweetID-Human.csv) provide labels following Ekman’s six basic emotions: anger, disgust, fear, joy, sadness, and surprise.

Additionally, two pre-trained word embedding models (Word2Vec and FastText) trained on the corpus are provided as TweetID-Word2Vec.zip and TweetID-FastText.zip for use in downstream NLP tasks.

All code used to construct the dataset is available in the GitHub repository: https://github.com/ksnugroho/EmoTweetID

The dataset is intended as a benchmark for affective computing and natural language processing in Indonesian, supporting research in emotion recognition, social media analysis, and the development of empathetic AI systems.
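The description reports inter-annotator agreement using Cohen’s and Fleiss’ Kappa but does not show the computation. The following is a minimal sketch of both statistics, not the authors' implementation: the function names cohen_kappa and fleiss_kappa and the toy labels are hypothetical, and no column names from the CSV files are assumed. Cohen’s Kappa compares two annotators at a time, while Fleiss’ Kappa covers all three annotators at once.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' label lists of equal length."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each annotator's label proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both annotators used one identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)

def fleiss_kappa(rows):
    """Fleiss' kappa; each row holds the labels all raters gave to one item."""
    n_items = len(rows)
    n_raters = len(rows[0])
    totals = Counter()
    agreement_sum = 0.0
    for row in rows:
        counts = Counter(row)
        totals.update(counts)
        # Share of rater pairs that agree on this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement_sum / n_items
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:  # every rating is the same label
        return 1.0
    return (p_bar - p_e) / (1 - p_e)

# Toy labels (hypothetical, not taken from the dataset): three annotators, four tweets.
toy_rows = [
    ("joy", "joy", "joy"),
    ("anger", "anger", "disgust"),
    ("fear", "fear", "fear"),
    ("sadness", "sadness", "joy"),
]
annotator_1 = [row[0] for row in toy_rows]
annotator_3 = [row[2] for row in toy_rows]

print("Cohen's kappa (annotators 1 and 3):", cohen_kappa(annotator_1, annotator_3))
print("Fleiss' kappa (all three):", fleiss_kappa(toy_rows))
```

To apply this to EmoTweetID-Human.csv, the per-annotator labels would first need to be read into lists like the ones above; the layout of that file should be checked before doing so.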

Institutions

Bina Nusantara University
Universitas Brawijaya
