Reddit Ideology Database

Published: 22 September 2023 | Version 2 | DOI: 10.17632/2tdr9sjd83.2


This dataset contains news articles posted in the r/Liberal and r/Conservative subreddits, 226,010 articles in total. We collected them to study political expression through the news articles that users share.


Steps to reproduce

All articles (Raw) - 226,010
Sampled class-balanced articles - 45,108
Annotated articles - 4,000

Each zip file has two files: Liberal.json and Conservative.json

Cite: Ravi, K., Vela, A. E., & Ewetz, R. (2022, December). Classifying the Ideological Orientation of User-Submitted Texts in Social Media. In 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA) (pp. 413-418). IEEE.
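Since each zip file holds a Liberal.json and a Conservative.json, a loader along the following lines may be a convenient starting point. This is a minimal sketch, not the authors' code: the function name `load_subreddit_articles` is hypothetical, and the internal layout of the JSON files is not documented here, so the code assumes each file is either a single JSON document or JSON Lines (one object per line). Adjust it after inspecting the actual files.

```python
import json
import zipfile


def load_subreddit_articles(zip_path):
    """Return a dict such as {"Liberal": ..., "Conservative": ...} from one archive.

    Hypothetical helper: the record structure inside each file is an assumption.
    """
    articles = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Match on the base name so files nested in a folder are also found.
            base = member.rsplit("/", 1)[-1]
            if base not in ("Liberal.json", "Conservative.json"):
                continue
            text = archive.read(member).decode("utf-8")
            try:
                # Assumption 1: the file is a single JSON document.
                data = json.loads(text)
            except json.JSONDecodeError:
                # Assumption 2: fall back to JSON Lines, one object per line.
                data = [json.loads(line) for line in text.splitlines() if line.strip()]
            # Use the file name without ".json" as the class label.
            articles[base[: -len(".json")]] = data
    return articles
```

Calling `load_subreddit_articles` on one of the downloaded archives and checking `len()` of each entry is a quick way to compare what was loaded against the article counts listed above, provided each file turns out to hold a list of records.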


University of Central Florida


Computational Linguistics, Social Media, Social Collaborative Computing, Computer Modeling in Social Science, Ideology