KurdABSA: Aspect Based Sentiment Analysis Dataset for Kurdish Language
Published: 6 August 2025 | Version 1 | DOI: 10.17632/h5t7p4bcj2.1
Contributor:
Rania Azad
Description
The dataset is the first publicly available aspect-based sentiment analysis (ABSA) dataset for the Sorani dialect of Kurdish, addressing a critical gap in natural language processing (NLP) research for low-resource languages. It comprises more than 4,000 ABSA quadruplets from the restaurant review domain, written in Kurdish (Sorani dialect) using the Perso-Arabic script. The dataset was annotated automatically using a few-shot, prompt-based model. This resource is intended for use in machine learning, deep learning, and cross-lingual model adaptation, and is suitable for training, fine-tuning, and benchmarking.
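To make the term "quadruplet" concrete, here is a minimal Python sketch of how one annotation could be represented and sanity-checked. It assumes the four-element convention common in ABSA work (aspect term, aspect category, opinion term, sentiment polarity). The class name `Quad`, its field names, the label set, and the English placeholder values are all hypothetical; they are not taken from the dataset files, whose actual schema is not described here.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed label set; the dataset's real polarity labels may differ.
VALID_SENTIMENTS = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class Quad:
    """One hypothetical ABSA quadruplet (field names are illustrative)."""
    aspect: str     # aspect term mentioned in the review
    category: str   # aspect category the term belongs to
    opinion: str    # opinion expression about the aspect
    sentiment: str  # polarity label


def is_valid(quad: Quad) -> bool:
    """Check that all fields are non-empty and the polarity is a known label."""
    fields_present = all([quad.aspect, quad.category, quad.opinion])
    return fields_present and quad.sentiment in VALID_SENTIMENTS


def sentiment_counts(quads: list[Quad]) -> Counter:
    """Count quadruplets per polarity label, skipping invalid entries."""
    return Counter(q.sentiment for q in quads if is_valid(q))


# Placeholder English values for illustration only (not dataset content).
sample = [
    Quad("soup", "food quality", "delicious", "positive"),
    Quad("service", "service general", "slow", "negative"),
    Quad("kebab", "food quality", "tasty", "positive"),
    Quad("", "ambience general", "noisy", "negative"),  # invalid: empty aspect
]

print(sentiment_counts(sample))
```

A check like this can be useful with automatically annotated data, since it flags incomplete or malformed quadruplets before training or benchmarking.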
Steps to reproduce
Please cite the dataset's paper if you use this dataset: (coming soon)
Institutions
Sulaimani Polytechnic University
Categories
Natural Language Processing, Sentiment Analysis