Uddessho: An Extensive Benchmark Dataset for Multimodal Author Intent Classification in Low-Resource Bangla Language

Published: 4 September 2024 | Version 1 | DOI: 10.17632/mzxmt8tfjs.1
Contributors:
…, Md Morshed Alam Lipson, Asif Iftekher Fahim, Md. Moinul Hoque

Description

The "Uddessho" dataset, whose name means "Intent" in English, is designed for multimodal author intent classification. It contains 3048 post instances categorized into six intent types: Informative, Advocative, Promotive, Exhibitionist, Expressive, and Controversial. The dataset is divided into a training set with 2423 posts, a testing set with 313 posts, and a validation set with 312 posts.

Distribution of Dataset Splits Across Different Intent Categories

=================================================
Intent Taxonomy    Training  Testing  Validation
=================================================
Informative             514       67          67
Advocative              386       49          49
Promotive               315       43          42
Exhibitionist           371       47          48
Expressive              518       66          66
Controversial           319       41          40
=================================================
Total                  2423      313         312
=================================================
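The split arithmetic in the description can be checked with a short script. The following is a minimal sketch, not part of the dataset's own tooling: the per-category counts are copied from the description, while the names `SPLIT_COUNTS`, `column_totals`, and `class_shares` are hypothetical and introduced here only for illustration.

```python
# Per-category counts copied from the dataset description,
# ordered as (training, testing, validation).
# SPLIT_COUNTS and the helper names below are illustrative, not dataset-provided.
SPLIT_COUNTS = {
    "Informative": (514, 67, 67),
    "Advocative": (386, 49, 49),
    "Promotive": (315, 43, 42),
    "Exhibitionist": (371, 47, 48),
    "Expressive": (518, 66, 66),
    "Controversial": (319, 41, 40),
}


def column_totals(counts):
    """Sum each split column across all intent categories."""
    return tuple(sum(column) for column in zip(*counts.values()))


def class_shares(counts):
    """Return each category's share of all posts, across the three splits."""
    grand_total = sum(sum(row) for row in counts.values())
    return {name: sum(row) / grand_total for name, row in counts.items()}


if __name__ == "__main__":
    train_total, test_total, val_total = column_totals(SPLIT_COUNTS)
    print("training:", train_total)
    print("testing:", test_total)
    print("validation:", val_total)
    print("all posts:", train_total + test_total + val_total)
    for name, share in class_shares(SPLIT_COUNTS).items():
        print(f"{name}: {share:.1%}")
```

The column sums reproduce the stated totals of 2423, 313, and 312 posts (3048 overall), and the per-category shares give a quick view of class imbalance before training.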

Institutions

  • Ahsanullah University of Science and Technology

Categories

Social Media, Bengali Language, Multimodality, Intention, Deep Learning

Licence