Urban Civic Issues Image Dataset: Potholes and Garbage (QR4Change)

Published: 23 September 2025 | Version 2 | DOI: 10.17632/zndzygc3p3.2

Description

This dataset has been developed to support research in computer vision for urban infrastructure monitoring and waste management, as part of the project QR4Change: A Smart QR-Based Civic Grievance Reporting System. The project aims to provide a technology-driven platform where citizens can conveniently report civic issues through QR codes, while automated image analysis assists municipal authorities in prioritizing and addressing complaints.

The images were collected from diverse sources, including open-source repositories, government portals, and on-field surveys in Pune (covering regions such as Kondhwa, Bibewadi, Swargate, and Market Yard).

The dataset is organized into two major categories:

Pothole Dataset: A total of 2,966 images, consisting of 1,004 pothole images and 1,962 plain road (non-pothole) images.

Garbage Dataset: A total of 1,971 images, consisting of 712 garbage dump images and 1,259 non-garbage images.

This dataset not only underpins the QR4Change project but is also intended to serve the wider research community in developing and evaluating machine learning models for tasks such as image classification, object detection, and smart city civic issue analysis.
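
Both categories are imbalanced toward the negative class, which can matter when training a classifier. The sketch below is an illustration only, not part of the published dataset: it takes the class counts stated above and derives each class's share and an inverse-frequency weight. The dictionary keys and the function name balanced_class_weights are hypothetical names chosen here, and the weighting formula (total / (number of classes * class count)) is one common heuristic rather than a method the authors describe.

```python
# Class counts copied from the dataset description.
# The key names are illustrative, not the dataset's directory names.
COUNTS = {
    "pothole": {"pothole": 1004, "plain_road": 1962},
    "garbage": {"garbage": 712, "non_garbage": 1259},
}


def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * class_count).

    Rarer classes receive weights above 1, more common classes below 1.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {label: total / (num_classes * n) for label, n in counts.items()}


for task, counts in COUNTS.items():
    total = sum(counts.values())
    weights = balanced_class_weights(counts)
    for label, n in counts.items():
        print(
            f"{task:8s} {label:12s} n={n:5d} "
            f"share={n / total:.3f} weight={weights[label]:.3f}"
        )
```

Weights of this form can be passed to a weighted loss function; whether reweighting actually helps for these two tasks would need to be checked empirically.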

Steps to reproduce

Methods and Data Collection

The dataset was compiled using a combination of field surveys, open-source repositories, and government portals.

Field Data Collection: Images were captured in Pune, India, across four regions: Kondhwa, Bibewadi, Swargate, and Market Yard. A standard smartphone camera was used, and photographs were taken from multiple angles and distances under natural lighting conditions. This ensured diversity in perspective, scale, and background.

Open-Source & Government Data: Additional images were sourced from freely available datasets, repositories, and municipal portals. Keywords such as “pothole,” “road damage,” “garbage dump,” “clean road,” and “waste-free area” were used for scraping relevant images. Only openly licensed and sufficiently high-resolution images were included.

Dataset Composition:
Potholes: 1,004 pothole images + 1,962 plain road images (2,966 total).
Garbage: 712 garbage dump images + 1,259 non-garbage images (1,971 total).

Processing Protocol: Duplicates and low-quality images were removed. Images were manually reviewed and labeled into two classes for each task (pothole/non-pothole, garbage/non-garbage). The final dataset was structured into separate directories for each class to support direct use in machine learning workflows.

This process is intended to keep the dataset diverse, consistently labeled, and reproducible, enabling its use for computer vision research in pothole and garbage detection.
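
Because the images are stored in one directory per class, a quick sanity check after download is to index the folders and compare the per-class counts with the figures listed above. The following is a minimal sketch under stated assumptions: the actual directory names and file extensions in the archive are not given in this record, so the extension set, the function names index_image_folder and count_per_class, and the example path in the comment are all hypothetical.

```python
from collections import Counter
from pathlib import Path

# Assumption: the archive stores images in common raster formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def index_image_folder(root):
    """Return (path, label) pairs, using each subdirectory name as the label."""
    root = Path(root)
    samples = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for path in sorted(class_dir.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, class_dir.name))
    return samples


def count_per_class(samples):
    """Count how many indexed images carry each label."""
    return Counter(label for _, label in samples)


# Example usage (the path is hypothetical; adjust to the extracted archive):
#   samples = index_image_folder("QR4Change/pothole_dataset")
#   print(count_per_class(samples))
# For the pothole task the two counts should add up to 2,966,
# and for the garbage task to 1,971, if the download is complete.
```

The resulting list of pairs can be fed into any training pipeline; libraries that read class-per-directory layouts directly are an alternative once the real folder names are known.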

Institutions

Vishwakarma Institute of Information Technology

Categories

Artificial Intelligence, Computer Vision, Image Processing, Deep Learning

Licence