Human Wellbeing and Machine Learning
Supplementary Material for Human Wellbeing and Machine Learning by Ekaterina Oparina (r) Caspar Kaiser (r) Niccolò Gentile; Alexandre Tkatchenko, Andrew E. Clark, Jan-Emmanuel De Neve and Conchita D'Ambrosio

This repository contains the lists of variables used in the Extended Set analysis for the German Socio-Economic Panel (SOEP), the UK Household Longitudinal Study (UKHLS), and the American Gallup Daily Poll. The variables are grouped into categories, and a summary table is reported at the beginning of the document.

We use the 2013 wave of Gallup and SOEP, and Wave 3 of the UKHLS (which covers 2011-2012). Our dataset includes all available variables, apart from direct measures of subjective wellbeing (such as domain satisfaction, happiness, or subjective health) or mental health, and technical variables (e.g. id numbers). We also exclude variables with more than 50% missing values.

The lists show the variables before processing. For the analysis, we convert each categorical variable into a set of dummies, one for each category. We then drop all perfectly collinear variables.
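The processing steps described above (the 50% missing-value filter, dummy encoding, and removal of perfectly collinear variables) could be sketched roughly as follows. This is a minimal illustration using pandas and NumPy, not the authors' actual pipeline: the function names, the toy data, the mean imputation used only for the rank check, and the greedy rank-based collinearity rule are all assumptions made for the example. It also assumes no intercept column is present.

```python
import numpy as np
import pandas as pd


def drop_mostly_missing(df, max_missing_share=0.5):
    """Exclude columns with more than `max_missing_share` missing values."""
    missing_share = df.isna().mean()
    return df.loc[:, missing_share <= max_missing_share]


def expand_categoricals(df):
    """Convert every non-numeric column into one dummy per category."""
    cat_cols = df.select_dtypes(exclude=["number", "bool"]).columns.tolist()
    return pd.get_dummies(df, columns=cat_cols, dtype=float)


def drop_perfectly_collinear(df):
    """Greedy rule (an assumption): keep a column only if it raises the rank
    of the columns kept so far. Missing values are mean-filled for the rank
    check only; the returned frame keeps the original values."""
    values = df.fillna(df.mean()).to_numpy(dtype=float)
    kept, rank = [], 0
    for j in range(values.shape[1]):
        new_rank = np.linalg.matrix_rank(values[:, kept + [j]])
        if new_rank > rank:
            kept.append(j)
            rank = new_rank
    return df.iloc[:, kept]


# Hypothetical toy data, not taken from SOEP, UKHLS, or Gallup.
raw = pd.DataFrame({
    "age": [25, 40, 31, 58, 47, 36],
    "income": [1.0, np.nan, np.nan, np.nan, np.nan, 2.0],  # 4 of 6 missing
    "employed": ["yes", "no", "yes", "yes", "no", "no"],
    "age_months": [300, 480, 372, 696, 564, 432],  # exactly 12 * age
})

processed = drop_perfectly_collinear(
    expand_categoricals(drop_mostly_missing(raw))
)
print(list(processed.columns))
```

In this toy example, `income` is removed by the missing-value filter and `age_months` is removed as an exact multiple of `age`. Both `employed` dummies survive because no constant column is included; with an intercept, one of them would be dropped as well.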