SubRide: Subjectivity Detection in Ride-Hailing App Reviews
Description
This dataset contains 1338 user reviews of ride-hailing applications (Gojek, Grab, and Maxim), collected from the Google Play Store between November 2024 and March 2025. Each review was manually labeled by two human annotators according to predefined labeling criteria and classified as either Subjective (1) or Objective (0). All reviews have undergone basic preprocessing, including the removal of emojis and URLs. The dataset includes each annotator's individual classifications as well as the final labels agreed upon after discussion. The primary goal of this dataset is to support research in subjectivity detection, with potential applications in natural language processing (NLP) and machine learning, particularly in the context of transportation-related services. It can be used to train and evaluate classification models, analyze user feedback trends, and explore how subjectivity influences user-generated content.
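Because the two annotators' individual labels are provided alongside the final labels, one natural first check is how often the annotators agreed. The sketch below computes Cohen's kappa for two lists of binary labels. Cohen's kappa is our choice of agreement measure and is not prescribed by the dataset; the function name `cohen_kappa` and the example label lists are hypothetical placeholders, not values from the data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences (illustrative helper)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for illustration only (1 = Subjective, 0 = Objective).
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}")
```

In practice the two lists would be read from the per-annotator files, aligned so that position i in both lists refers to the same review.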
Steps to reproduce
The reviews were collected through web scraping from the Google Play Store, specifically targeting the review pages of the Gojek, Grab, and Maxim ride-hailing apps. Each row in the dataset represents a single user review and contains the following columns:
- score: The rating given by the user (typically from 1 to 5)
- app: The name of the ride-hailing app the review was taken from (Gojek, Grab, or Maxim)
- review: The text content of the user's review
- translated_review: The English translation of the original user review
- label: The subjectivity label for the review (1 = Subjective, 0 = Objective)
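The column layout described above can be checked when loading the data. The following is a minimal sketch using only the Python standard library, assuming the data is available as a CSV file with a header row (the file format and name are not stated here, so treat that as an assumption). The helper names `load_reviews` and `label_counts` are hypothetical, and the sample rows are invented placeholders rather than actual reviews.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = ["score", "app", "review", "translated_review", "label"]


def load_reviews(handle):
    """Read rows from an open CSV handle and validate columns and labels."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for row in reader:
        label = int(row["label"])
        if label not in (0, 1):
            raise ValueError(f"unexpected label: {label}")
        rows.append({**row, "score": int(row["score"]), "label": label})
    return rows


def label_counts(rows):
    """Count Subjective (1) and Objective (0) reviews."""
    return Counter(r["label"] for r in rows)


# Invented placeholder rows, used only so the sketch runs on its own.
SAMPLE = """score,app,review,translated_review,label
5,Gojek,placeholder original text,placeholder translation,1
1,Grab,placeholder original text,placeholder translation,0
"""

rows = load_reviews(io.StringIO(SAMPLE))
print(label_counts(rows))
```

To use this on the real data, replace `io.StringIO(SAMPLE)` with `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded file.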
Institutions
- Bina Nusantara University