Ina-SASet: dataset for developing Indonesian sentiment lexicon for extracting consumer reference based on fine grained sentiment analysis technique

Published: 4 September 2023 | Version 1 | DOI: 10.17632/52h3d44n85.1
Bagus Setya Rintyarna,


Consumer preference describes how consumers decide which product suits their needs. Extracting consumer preference lets us depict how well consumers accept a specific commercial product. For potential consumers, revealed preferences help in selecting a product of interest as recommended by previous consumers. Likewise, consumer preference is important for companies seeking to advance product design and improve quality in line with consumer interest. Conventionally, consumer preference is determined through paper-and-pencil surveys in which respondents are interviewed about distinct product descriptions along with their attribute levels. This approach rests on the construct that any product can be represented in terms of its features or characteristics, each of which can take different levels. The technique is, however, considered time-consuming, labor-intensive and costly. This dataset was collected as part of an attempt to model consumer preference automatically from online platform data by employing a fine-grained, lexicon-based sentiment analysis method. Ina-SASet is, hence, an initial dataset for developing an Indonesian sentiment lexicon for extracting consumer preference based on a fine-grained sentiment analysis technique. The dataset contains raw data (Tweets) and pre-processed data; the pre-processing involves tokenizing, stop word removal and stemming.
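
The three pre-processing steps named above (tokenizing, stop word removal, stemming) can be illustrated with a minimal Python sketch. This is not the pipeline used to build Ina-SASet, whose tools are not stated here. The stop word list, the suffix list, the sample tweet, and the function names below are all hypothetical and deliberately tiny. A real pipeline would likely use a full Indonesian stop word list and a proper Indonesian stemmer in place of the naive suffix stripper.

```python
import re

# Hypothetical, deliberately tiny stop word list (illustration only).
STOP_WORDS = {"yang", "dan", "di", "ini", "itu", "saya", "untuk", "dengan"}

# Hypothetical suffixes for a naive stand-in stemmer; real Indonesian
# stemming also handles prefixes, confixes and many exceptions.
SUFFIXES = ("nya", "kan", "lah")


def tokenize(text):
    """Lowercase the text, drop URLs and @mentions, and return word tokens."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    return re.findall(r"[a-z]+", text)


def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Strip one known suffix if at least four characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Apply tokenizing, stop word removal and stemming, in that order."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


# Made-up example tweet, not taken from the dataset.
sample = "Kameranya bagus dan baterainya awet, saya suka! https://example.com @toko"
print(preprocess(sample))
```

Running the steps in this order keeps the stop word lookup on surface forms, so the stop word list does not need to anticipate stemmed variants.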



Text Mining, Sentiment Analysis