Sentiment Analysis Dataset of Seblak Njedir Reviews

Published: 20 April 2026 | Version 1 | DOI: 10.17632/pnb94tfpyy.1
Contributors:
Andini Aulia Putri

Description

This dataset comprises customer reviews collected from the Seblak Njedir food outlet, including reviewer names, visit details, ratings, and textual feedback on menu items, taste, service, and overall experience. The dataset is structured to support natural language processing tasks such as text summarization, sentiment analysis, and opinion mining. Researchers can use it to develop, test, and benchmark models that extract insights from customer feedback, analyze consumer preferences, or improve recommendation systems in the food service and hospitality domain.

Keywords: customer reviews, food review dataset, sentiment analysis, text summarization, NLP, Seblak Njedir, opinion mining, restaurant feedback.

Files

Steps to reproduce

1. Data Collection: Gather customer reviews directly from the Seblak Njedir food outlet’s feedback forms, social media, or online platforms. Include the following fields: reviewer name, visit date or identifier, rating (numerical), and review text.
2. Data Cleaning: Remove duplicate reviews and irrelevant entries. Normalize text by correcting spelling errors and standardizing character encoding. Strip unnecessary symbols or formatting artifacts (e.g., newlines, or emojis if not needed).
3. Data Structuring: Organize the cleaned data into a tabular format with the columns Nama (Name), Rating, Visit_Info (optional), and Review_Ulasan (Review Text). Save the structured dataset in a common format such as .xlsx or .csv.
4. Data Preprocessing for NLP (Optional for Research Use): Tokenize review text into sentences or words. Apply text normalization, such as lowercasing and removing stopwords if necessary. Optionally, generate embeddings (e.g., using Word2Vec or GloVe) for downstream analysis such as sentiment classification or summarization.
5. Validation: Verify dataset integrity by ensuring that each review has a corresponding rating and non-empty text. Check for consistent formatting across all entries.
6. Usage: The dataset can now be used directly for natural language processing tasks such as extractive summarization, sentiment analysis, opinion mining, or exploratory analysis of customer preferences.
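The cleaning and validation steps (2 and 5) could be sketched roughly as follows, using only the Python standard library. This is an illustrative sketch, not the contributor's actual pipeline: the column names come from step 3, but the helper names (normalize_text, clean_reviews), the 1 to 5 rating range, the duplicate rule (same name and same review text), and the sample rows are all assumptions made for the example.

```python
import csv
import io
import re
import unicodedata


def normalize_text(text):
    """Standardize Unicode form (NFKC) and collapse newlines/whitespace runs."""
    text = unicodedata.normalize("NFKC", text or "")
    return re.sub(r"\s+", " ", text).strip()


def clean_reviews(rows, min_rating=1, max_rating=5):
    """Normalize text, drop rows failing validation, and drop duplicates.

    The rating bounds are an assumption; the source only says the rating
    is numerical.
    """
    seen = set()
    cleaned = []
    for row in rows:
        name = normalize_text(row.get("Nama"))
        review = normalize_text(row.get("Review_Ulasan"))

        # Validation: every review needs a parseable, in-range rating.
        try:
            rating = float(str(row.get("Rating") or "").strip())
        except ValueError:
            continue
        if not min_rating <= rating <= max_rating:
            continue

        # Validation: review text must be non-empty after normalization.
        if not review:
            continue

        # Deduplication rule (assumed): same reviewer and same text,
        # compared case-insensitively.
        key = (name.lower(), review.lower())
        if key in seen:
            continue
        seen.add(key)

        cleaned.append({
            "Nama": name,
            "Rating": rating,
            "Visit_Info": normalize_text(row.get("Visit_Info")),  # optional
            "Review_Ulasan": review,
        })
    return cleaned


# Made-up rows for illustration only; they are not taken from the dataset.
SAMPLE = """Nama,Rating,Visit_Info,Review_Ulasan
Reviewer A,5,,"Enak banget,
pedasnya pas"
Reviewer A,5,,"Enak banget, pedasnya pas"
Reviewer B,,,Pelayanan cepat
Reviewer C,4,,
Reviewer D,3,,Porsi cukup besar
"""

cleaned = clean_reviews(csv.DictReader(io.StringIO(SAMPLE)))
```

Here the second row is dropped as a duplicate of the first once the embedded newline is collapsed, and the rows with a missing rating or empty text fail validation. For the real file, the same function could be applied to csv.DictReader over the exported .csv, provided its headers match the column names in step 3.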

Categories

Artificial Intelligence, Natural Language Processing, Text Extraction

Licence