UK dataset of self-referential valence, imageability and subjective frequency ratings of 300 adjectives for use in cognitive-emotional tasks

Published: 4 November 2022 | Version 2 | DOI: 10.17632/kgk3jbx9xb.2


The present dataset provides subjective ratings of valence, imageability and frequency for 150 positive and 150 negative adjectives describing personality characteristics. The words in this dataset can be used to develop novel experimental tasks, or to validate existing ones, across a wide range of cognition research.
In Phase 1 of data collection, an initial sample of 100 participants provided self-referential valence ratings for a list of 482 adjectives depicting personality characteristics. These ratings were averaged across the sample to facilitate the exclusion of ambiguous words (those rated neither negative nor positive) and to produce a final list of 300 words (150 negative and 150 positive). In Phase 2 of data collection, we sought to further characterise these 300 words with three separate online surveys collecting ratings of self-referential valence, imageability and subjective frequency. A further 102 participants provided self-referential valence ratings, 200 participants provided imageability ratings and 202 participants provided subjective frequency ratings. Basic demographics and data on depressive symptoms and state anxiety were collected from all participants; see Tables 1a and 1b.
The raw ratings collected in each of the four surveys are provided in the "Raw Datasets" folder, and the exact surveys used are provided in Supplementary file 1. For each of the 300 personality descriptors, and for each type of rating, we computed a series of statistics: mean, standard deviation, standard error, number of ratings received, median, minimum rating, maximum rating, range, skew and kurtosis. The statistics for self-referential valence, imageability and subjective frequency, together with word length, were merged into a final dataset (see "Positive and negative personality descriptor words dataset").
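As a rough illustration of the per-word statistics listed above, the following Python sketch computes them from (word, rating) pairs. It is not the authors' R script (see Supplementary file 2); the function names, the input layout and the toy ratings are assumptions made for this example. In particular, skew and kurtosis are computed here from simple moment-based formulas (excess kurtosis, n denominator), which may differ from the estimator used in the published analysis.

```python
import math
import statistics
from collections import defaultdict


def describe_ratings(ratings):
    """Descriptive statistics for one word's ratings (a list of numbers)."""
    n = len(ratings)
    mean = statistics.fmean(ratings)
    # Sample standard deviation (n - 1 denominator); undefined for n < 2.
    sd = statistics.stdev(ratings) if n > 1 else math.nan
    # Central moments with an n denominator, used for skew and kurtosis.
    # ASSUMPTION: the published analysis may use a different estimator.
    m2 = sum((r - mean) ** 2 for r in ratings) / n
    m3 = sum((r - mean) ** 3 for r in ratings) / n
    m4 = sum((r - mean) ** 4 for r in ratings) / n
    return {
        "mean": mean,
        "sd": sd,
        "se": sd / math.sqrt(n),
        "n": n,
        "median": statistics.median(ratings),
        "min": min(ratings),
        "max": max(ratings),
        "range": max(ratings) - min(ratings),
        "skew": m3 / m2 ** 1.5 if m2 > 0 else math.nan,
        "kurtosis": m4 / m2 ** 2 - 3 if m2 > 0 else math.nan,
    }


def describe_by_word(rows):
    """Group (word, rating) pairs by word and describe each group."""
    by_word = defaultdict(list)
    for word, rating in rows:
        by_word[word].append(rating)
    return {word: describe_ratings(vals) for word, vals in by_word.items()}


# Toy ratings, invented for illustration only (not taken from the dataset).
toy_rows = [("kind", 6), ("kind", 7), ("kind", 5),
            ("rude", 2), ("rude", 1), ("rude", 3)]
summary = describe_by_word(toy_rows)
print(summary["kind"]["mean"], summary["rude"]["median"])
```

The same function could be applied separately to the valence, imageability and subjective frequency ratings, with the resulting per-word tables then joined on the word.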
We pooled scores from all participants for the reported statistical analyses, based on exploratory analyses showing that age, gender and depression/anxiety symptoms had little effect on participant ratings (see Figures 2-8). However, if greater stratification is desired, statistics for specific populations can be recalculated from the raw datasets. The R script we used for data cleaning and analysis is provided in Supplementary file 2. We also explored the relationship between the initial self-referential valence ratings collected in Phase 1 (first Qualtrics survey) and those collected during Phase 2 (second Qualtrics survey) for our final list of 300 words. We found the mean ratings for each word to be highly correlated between the two surveys (Spearman’s rho = 0.97, p < .01; see Figure 9). Additionally, we conducted a mixed effects analysis of variance to statistically assess the effects of data collection phase on the self-referential valence ratings acquired for each personality descriptor (see "Self-referential valence reliability dataset"). Only fully anonymised data are provided; all pseudonymous variables were removed by the research team prior to sharing.
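The Phase 1 versus Phase 2 comparison of per-word mean valence ratings can be sketched as a rank correlation. The Python example below is a minimal illustration under stated assumptions, not the authors' R code: the function names and the toy per-word means are invented for this example, ties receive average ranks, and no p-value is computed.

```python
import math
import statistics


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return math.nan  # correlation is undefined for a constant sequence
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy per-word mean ratings, invented for illustration (not from the dataset).
# Both lists must be in the same word order.
phase1_means = [1.8, 2.4, 5.9, 6.3, 3.1]
phase2_means = [2.0, 2.2, 6.1, 6.0, 3.5]
print(round(spearman_rho(phase1_means, phase2_means), 3))
```

With the real data, the two input lists would hold the Phase 1 and Phase 2 mean valence ratings for the same 300 words, aligned by word.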


Steps to reproduce

Please use the R script provided in Supplementary file 2 to reproduce the final dataset, tables and figures from the raw data.


University of Oxford Department of Psychiatry, University of Oxford


Word Processing, Cognition, Memory, Cognitive Function, Cognitive Neuroscience, Emotional Memory, Cognitive Effect, Cognitive Bias, Memory Bias, Word Recognition, Episodic Memory for Words, Word Frequency Effects in Episodic Recognition, Word List, Cognitive Change