Multimodal-Tokopedia Review

Published: 12 June 2026 | Version 1 | DOI: 10.17632/9zs3cgfhpp.1
Contributor:
Bella Miranda

Description

Multimodal Tokopedia Review is a collection of Tokopedia user reviews that combines two modalities: text and images. The textual modality consists of the comments or reviews that users write after completing a transaction, and the visual modality consists of the product images uploaded with each review. Combining the two allows a more comprehensive analysis, because sentiment can be extracted from the accompanying images as well as from the text.

The dataset supports multimodal sentiment analysis and aspect-based sentiment analysis, both of which draw on text and images together. A review may reflect the user's experience of several aspects, such as product quality, seller service, the delivery process, and product conformity. Integrating the two sources gives a richer representation than unimodal approaches that rely on text alone or images alone.

In this study, the Multimodal Tokopedia Review dataset is the primary data source for feature extraction, aspect and sentiment labeling, data balancing with the Synthetic Minority Over-sampling Technique (SMOTE), and the development of a multimodal sentiment classification model. Using this dataset is expected to improve the model’s ability to capture relationships between textual and visual information, and so to improve the accuracy and robustness of sentiment classification.
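The description does not say how SMOTE is applied to the multimodal features, so the following is only a minimal sketch of the core SMOTE idea under stated assumptions. It assumes that each review is already represented by a numeric text vector and a numeric image vector, and that the two are fused by simple concatenation. The function name `smote_oversample`, the toy feature values, and the fusion step are all hypothetical and are not taken from the dataset or the study.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours (the SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Pick a random point on the segment between base and neighbour.
        lam = rng.random()
        synthetic.append([b + lam * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy features for three minority-class reviews (assumption:
# text and image embeddings are fused by concatenation).
text_feats = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25]]
image_feats = [[0.9], [0.8], [0.85]]
fused_minority = [t + v for t, v in zip(text_feats, image_feats)]

new_points = smote_oversample(fused_minority, n_new=4, k=2)
```

Each synthetic sample lies on the line segment between two real minority samples, so it stays inside the range spanned by the minority class in every feature dimension. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn library, would usually be preferable to a hand-written one.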

Files

Categories

Natural Language Processing, Sentiment Analysis, Multimodal Transformer

Licence