Data for: The Big Picture of Cities: Analysing Flickr Photos of 222 Cities Worldwide Using Machine Learning

Published: 1 May 2020 | Version 1 | DOI: 10.17632/kvgwpdzkn5.1
Viriya Taecharungroj, Boonyanit Mathayomchan


The dataset covers the 222 cities in Mercer's Quality of Living (QoL) ranking 2019. The photos were retrieved from Flickr. To detect the objects and features in every photo, the authors used Google Cloud Vision, an image analysis service that extracts information from images. All photos of the 222 cities were processed to detect their labels. The authors then conducted latent Dirichlet allocation (LDA) modelling, a widely used topic modelling algorithm in machine learning, on the detected labels. Each photo was assigned a city image dimension; the dimensions include cityscape, landscape, architecture, transport, and recreation.

The table shows (1) the identification number of the city, ordered by Mercer's QoL ranking, (2) the name of the photo file, numbered from 1 to 1,000, (3) the concatenated labels of each photo, (4) the number of labels, (5) the Flickr ID of the photo, (6) the owner ID, (7) the type of license, (8) the date the photo was taken, (9) the view count, (10) latitude, (11) longitude, (12) the LDA results, and (13) the city image dimension.
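As a rough illustration of how a table with this layout might be summarised, the sketch below splits the concatenated labels of each photo and computes the share of each city image dimension per city. It is only a sketch under stated assumptions: the CSV format, the column names (`city_id`, `photo_file`, `labels`, `dimension`), the comma separator inside the label field, and the four sample rows are all invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter, defaultdict

# Toy rows invented for illustration only; they are NOT from the dataset.
# The column names and the comma-separated label format are assumptions.
SAMPLE_CSV = """city_id,photo_file,labels,dimension
1,1,"Sky, Building, Tower",cityscape
1,2,"Tree, Mountain",landscape
2,1,"Bus, Road, Vehicle",transport
2,2,"Skyline, Sky",cityscape
"""


def split_labels(concatenated, sep=","):
    """Split a concatenated label string into a list of clean labels."""
    return [part.strip() for part in concatenated.split(sep) if part.strip()]


def dimension_shares(rows):
    """Return, per city, the fraction of photos in each dimension."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["city_id"]][row["dimension"]] += 1
    shares = {}
    for city, counter in counts.items():
        total = sum(counter.values())
        shares[city] = {dim: n / total for dim, n in counter.items()}
    return shares


# Parse the toy table and recompute the number of labels per photo.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    row["n_labels"] = len(split_labels(row["labels"]))

shares = dimension_shares(rows)
for city, dims in sorted(shares.items()):
    print(city, dims)
```

To apply the same idea to the published table, the in-memory sample would be replaced by the actual file, and the column names and label separator adjusted to whatever the file uses.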



Social Sciences