Indonesian COVID-19 Vaccination-related Tweets for Stance Detection and Aspect-based Sentiment Analysis

Published: 30 August 2022 | Version 3 | DOI: 10.17632/7ky2jbjwtn.3


The dataset was collected through the Twitter API using specific keywords and covers tweets posted over ten months, from January 2021 to October 2021. The data was filtered to remove non-Indonesian-language tweets, tweets unrelated to the target, spam, and duplicates. There are two labeling processes: stance and aspect-based sentiment. Three annotators manually labeled the sample data, and the final class label was determined by majority vote. For stance labeling, each annotator was asked to annotate individual tweets as "Favor", "Against", or "Neutral" toward COVID-19 vaccination programs in Indonesia. For aspect-based sentiment labeling, each tweet was annotated against seven predetermined aspects of COVID-19 vaccination: "Services", "Implementation", "Apps", "Costs", "Participants", "Vaccine-products", and "General". Each aspect takes one of two sentiment values, "Positive" or "Negative".
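
The majority-vote step described above can be illustrated with a minimal Python sketch. It is not the authors' annotation code: the function name `majority_vote` and the constant `STANCE_LABELS` are hypothetical, and returning `None` when no label wins a strict majority (for example, when all three annotators disagree) is an assumption, since the description does not say how such cases were resolved.

```python
from collections import Counter
from typing import Optional, Sequence

# Stance classes named in the dataset description.
STANCE_LABELS = ("Favor", "Against", "Neutral")


def majority_vote(labels: Sequence[str]) -> Optional[str]:
    """Return the label chosen by a strict majority of annotators.

    Returns None when no label has more than half of the votes.
    This tie handling is an assumption, not taken from the dataset.
    """
    if not labels:
        return None
    counts = Counter(labels)
    label, count = counts.most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Hypothetical annotations from three annotators for two tweets.
agreed = majority_vote(["Favor", "Favor", "Neutral"])
split = majority_vote(["Favor", "Against", "Neutral"])
```

With three annotators, a two-to-one split yields the majority label, while a three-way split leaves the tweet unresolved under this sketch. The same function could be applied per aspect to the "Positive" and "Negative" sentiment labels.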



Institut Teknologi Sepuluh Nopember


Social Sciences, Computer Science, Natural Language Processing, Text Mining, Vaccination, Twitter, COVID-19