Automated Personality Prediction
Published: 12 February 2024 | Version 1 | DOI: 10.17632/3sndbd4p84.1
Contributor:
Fatima Habib
Description
This dataset contains preprocessed texts from the Reddit platform and the corresponding Big Five personality scores for 1,608 users, covering more than 27,000 comments.
Files
Steps to reproduce
The data are extracted from the personality-focused PANDORA dataset. The texts are split into three files, used respectively to train a large language model, validate it, and evaluate the results.
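A three-way split of this kind can be sketched as follows. This is a minimal illustration only: the record layout, the 80/10/10 proportions, the random shuffle, and the function name `split_three_ways` are all assumptions, since this page does not state how the three files were actually produced.

```python
import random


def split_three_ways(records, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle records and split them into train, validation and test lists.

    The fractions and the seeded shuffle are illustrative assumptions,
    not the procedure used to build the published files.
    """
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical records: (user id, preprocessed text, five trait scores).
records = [(f"user{i}", f"comment {i}", [0.5] * 5) for i in range(100)]
train, val, test = split_three_ways(records)
```

Each of the three returned lists could then be written to its own file. If several comments belong to the same user, splitting by user rather than by comment may be preferable so that no user appears in more than one split.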
Institutions
- National University of Computer and Emerging Sciences
Categories
Social Media, Personality, Computational Aspects, Language