Dataset containing smoking and not-smoking images (smoker vs non-smoker)

Published: 18-07-2020 | Version 1 | DOI: 10.17632/7b52hhzs3r.1
Ali Khan


The dataset contains 2400 raw images in total: 1200 images belong to the smoking (smokers) category and the remaining 1200 belong to the not-smoking (non-smokers) category. The dataset was curated by searching various search engines with multiple keywords, including cigarette smoking, smoker, person, coughing, taking inhaler, person on the phone, drinking water, etc. We tried to include varied images in both classes to create a certain degree of inter-class confusion and thereby train the model better. For instance, the smoking category consists of images of smokers taken from multiple angles and showing various gestures. The not-smoking category, in turn, contains images of non-smokers whose gestures resemble those in the smoking images, such as people drinking water, using an inhaler, holding a mobile phone, biting their nails, etc. Prospective researchers can use the dataset to develop machine learning algorithms for the automated detection and screening of smokers, in support of a green environment and surveillance in smart cities.
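Because the two classes are balanced, a class-wise (stratified) split keeps that balance in both the training and validation subsets. The following is a minimal sketch using only the Python standard library. The folder names `smoking` and `notsmoking`, the accepted file suffixes, and the helper names `list_labeled_images` and `stratified_split` are assumptions made for illustration; the actual archive layout may differ.

```python
import random
from pathlib import Path

# Assumed image suffixes; adjust to match the files in the archive.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labeled_images(root, class_dirs=("smoking", "notsmoking")):
    """Return (path, label) pairs, one label index per class folder.

    The folder names are hypothetical and not taken from the dataset page.
    """
    samples = []
    for label, name in enumerate(class_dirs):
        folder = Path(root) / name
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples


def stratified_split(samples, val_fraction=0.2, seed=0):
    """Split (item, label) pairs so each class keeps the same validation share."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    by_label = {}
    for item, label in samples:
        by_label.setdefault(label, []).append((item, label))

    train, val = [], []
    for label in sorted(by_label):
        items = by_label[label]
        rng.shuffle(items)
        n_val = int(round(len(items) * val_fraction))
        val.extend(items[:n_val])
        train.extend(items[n_val:])
    return train, val
```

With 1200 images per class and `val_fraction=0.2`, this split places 960 images per class in the training subset and 240 per class in the validation subset.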