Personality Ontology Corpus for Indonesian Social Media Text

Published: 17 August 2021 | Version 1 | DOI: 10.17632/vmj7b9ppxx.1
Andry Alamsyah


The corpus contains a list of words and phrases in informal Indonesian that are mapped to specific traits and sub-traits of the Big Five personality model. The list has been curated by psychologists and linguistics experts.


Steps to reproduce

The dataset maps words and phrases to the traits and sub-traits of the Big Five personality model. It was created in three steps. First, we collected words and phrases from many informal Indonesian-language conversations on social media. Second, we mapped each word or phrase to the appropriate personality trait and sub-trait. Third, psychologists and Indonesian language experts verified and validated the mappings.
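
As a rough illustration of how a phrase-to-trait mapping of this kind might be applied, the sketch below counts trait matches in a piece of text. It makes several assumptions: the dictionary layout, the function name count_traits, and the three sample entries are hypothetical and are not taken from the corpus, and the sub-trait labels follow common Big Five facet names, which may differ from the labels used in the dataset files.

```python
import re
from collections import Counter

# Hypothetical entries for illustration only; NOT taken from the corpus.
# Each informal word/phrase maps to a (trait, sub_trait) pair.
ONTOLOGY = {
    "nongkrong": ("Extraversion", "Gregariousness"),
    "bantuin orang": ("Agreeableness", "Altruism"),
    "galau": ("Neuroticism", "Anxiety"),
}


def count_traits(text, ontology):
    """Count how many times phrases linked to each trait occur in text."""
    normalized = text.lower()
    counts = Counter()
    for phrase, (trait, _sub_trait) in ontology.items():
        # Whole-word match so a phrase is not found inside a longer word.
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[trait] += hits
    return counts


if __name__ == "__main__":
    post = "Galau banget, mending nongkrong bareng teman"
    print(count_traits(post, ONTOLOGY))
```

Simple whole-word matching is used here only to keep the sketch short; informal social media text often contains spelling variants and slang that would need normalization before lookup.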


Social Media, Ontology, Personality Assessment, Big Five, The Five-Factor Model of Personality