Personality Ontology Corpus for Indonesian Social Media Text
The corpus contains a list of words and phrases in informal Indonesian that are mapped to specific traits and sub-traits of the Big Five personality model. The list has been curated by psychologists and language experts.
Steps to reproduce
The dataset maps words and phrases to the sub-traits and traits of the Big Five personality model. The dataset was created in three steps. First, we collected the data (words and phrases) from many informal Indonesian-language conversations on social media. Second, we mapped each item to a suitable personality sub-trait and trait. Third, psychologists and Indonesian language experts verified and validated the mapping.
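The phrase-to-trait mapping described above could be represented and queried roughly as in the following Python sketch. This is only an illustration under stated assumptions: the `Entry` fields, the `count_traits` helper, and the two placeholder entries are hypothetical and are not taken from the dataset, whose actual file format, column names, and sub-trait labels may differ.

```python
from collections import Counter
from typing import NamedTuple


class Entry(NamedTuple):
    """One ontology row: a word/phrase with its sub-trait and trait (assumed layout)."""
    phrase: str
    sub_trait: str
    trait: str


# Placeholder rows for illustration only; these are NOT real corpus entries.
ONTOLOGY = [
    Entry("placeholder phrase a", "gregariousness", "extraversion"),
    Entry("placeholder_b", "anxiety", "neuroticism"),
]


def count_traits(text: str, ontology: list[Entry]) -> Counter:
    """Count how often each trait's phrases occur in a whitespace-tokenized text."""
    tokens = text.lower().split()
    counts: Counter = Counter()
    for entry in ontology:
        words = entry.phrase.lower().split()
        n = len(words)
        # Slide a window of n tokens over the text and compare with the phrase.
        hits = sum(
            1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words
        )
        if hits:
            counts[entry.trait] += hits
    return counts
```

Calling `count_traits(post_text, ONTOLOGY)` returns a per-trait tally of matched phrases for one post. Simple whitespace tokenization is an assumption made for brevity; informal social media text would likely need normalization (punctuation, elongated spellings, abbreviations) before matching.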