ConViD — Concrete Visual Defect Dataset

Published: 23 April 2026 | Version 3 | DOI: 10.17632/fx3rthfjhy.3
Contributors:
Aashray Maheshwari

Description

This dataset is a large and diverse collection of real-world photographs of concrete surface defects: spalling, honeycombing, voids, and cracks. The images were taken outdoors with consumer-grade mobile phones under varying weather and lighting conditions in Pune, India, so they reflect the conditions encountered during inspection in the field. The data is organized to make it easy to develop and compare automated diagnostic systems based on deep learning, especially convolutional neural networks (CNNs). Unlike most existing datasets, which focus on identifying cracks only, this dataset supports multi-class classification of defect types that require different repair methods under engineering standards such as ACI.

A key characteristic of the dataset is its difficulty. Several defect types, especially spalling, cracks, and honeycombing, look very similar, which makes them hard to separate. This poses a demanding problem for computational models and makes the dataset a strong benchmark for fine-grained image categorization, feature representation learning, and domain adaptation methods. Initial testing shows significant overlap in characteristics between classes, which further indicates that this is a difficult experimental setting. Because of its real acquisition conditions and variability, the dataset is well suited to improving models that run on commonly used hardware such as smartphones, the setting in which the gap between research and practical structural assessment is narrowing.
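When comparing classifiers on the four classes, a class-balanced train/test split keeps every defect type represented in the held-out set. The following is a minimal sketch only, using the Python standard library. The lowercase class names, the `stratified_split` helper, and the idea that each image comes as a `(path, label)` pair are assumptions made for illustration; the dataset record does not specify a folder layout or an official split.

```python
import random
from collections import defaultdict

# Class names follow the dataset description; the lowercase spelling is an
# assumption and may not match the actual folder or label names.
CLASSES = ("honeycombing", "voids", "cracks", "spalling")


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (path, label) pairs so each class contributes the same
    fraction of its images to the test set."""
    by_class = defaultdict(list)
    for path, label in samples:
        if label not in CLASSES:
            raise ValueError(f"unknown class: {label!r}")
        by_class[label].append(path)

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    train, test = [], []
    for label in CLASSES:
        paths = sorted(by_class[label])  # sort first so the shuffle is deterministic
        rng.shuffle(paths)
        n_test = round(len(paths) * test_fraction)
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test
```

Calling `stratified_split(samples, test_fraction=0.2)` on a list of labelled paths returns two disjoint lists. Splitting per class rather than over the whole pool avoids a held-out set that under-represents one of the visually similar classes.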

Files

Steps to reproduce

1. Objective
To create a primary image dataset of four structural concrete defects:
- Honeycombing
- Voids
- Cracks
- Spalling
Images were collected under controlled but naturally varying environmental conditions to ensure robustness and domain realism.

2. Equipment Specifications
- Device: Smartphone camera
- Resolution Mode: 1:1 (square aspect ratio)
- Camera Resolution: 12 Megapixels
- Image Format: JPEG
- Capture Mode: Default camera mode (no HDR enhancement filters)
- Zoom: No digital zoom used
- Flash: Disabled

3. Data Collection Strategy

3.1 Number of Expeditions
Data was collected over three independent site visits (expeditions) to reduce sampling bias.

3.2 Time-of-Day Variation
Each expedition included image acquisition at:
- Morning (natural diffuse light)
- Afternoon (high-intensity direct sunlight)
- Night (artificial lighting conditions)
This ensured:
- Illumination variability
- Shadow variation
- Texture contrast differences
- Real-world deployment robustness

4. Image Capture Protocol
For each defect instance:
- Camera positioned perpendicular (~90°) to the surface when possible.
- Distance maintained between 0.5 and 1.5 meters, depending on defect size.
- Multiple angles captured: frontal view and slight oblique view (15°–30° tilt).
- No artificial cleaning or surface alteration was performed.
- Background clutter was minimized where feasible.

5. Class Definitions

5.1 Honeycombing
- Exposed aggregate
- Surface cavities
- Poor compaction texture

5.2 Voids
- Circular or irregular cavity formation
- Depth perceptible under shadow

5.3 Cracks
- Linear fracture patterns
- Width varying from hairline to visible structural cracks

5.4 Spalling
- Concrete surface detachment
- Flaking with exposed internal layers

6. Environmental Variability Control
The following variations were intentionally preserved:
- Lighting intensity (lux variation across time)
- Shadow presence
- Surface moisture (if naturally present)
- Surface dust conditions
No post-processing correction was applied at the collection stage.
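Anyone extending the dataset with new captures may find it helpful to record the protocol parameters per image and check them automatically. The sketch below is one possible way to do that, not part of the published dataset. The `CaptureRecord` schema, its field names, and the `protocol_violations` helper are hypothetical; only the numeric ranges (three expeditions, three times of day, 0.5 to 1.5 m distance, 15° to 30° oblique tilt) come from the steps above. Treating a frontal view as exactly 0° tilt is a simplification.

```python
from dataclasses import dataclass

# Values taken from the collection steps; the lowercase spellings are assumptions.
DEFECT_CLASSES = ("honeycombing", "voids", "cracks", "spalling")
TIMES_OF_DAY = ("morning", "afternoon", "night")
NUM_EXPEDITIONS = 3


@dataclass(frozen=True)
class CaptureRecord:
    """Hypothetical per-image metadata; the dataset does not define this schema."""
    defect_class: str
    expedition: int     # site visit number, 1 to 3
    time_of_day: str    # one of TIMES_OF_DAY
    distance_m: float   # camera-to-surface distance in meters
    tilt_deg: float     # 0 = frontal (simplification), 15 to 30 = slight oblique


def protocol_violations(rec):
    """Return a list of deviations from the capture protocol (empty if none)."""
    problems = []
    if rec.defect_class not in DEFECT_CLASSES:
        problems.append(f"unknown defect class: {rec.defect_class!r}")
    if not 1 <= rec.expedition <= NUM_EXPEDITIONS:
        problems.append(f"expedition out of range: {rec.expedition}")
    if rec.time_of_day not in TIMES_OF_DAY:
        problems.append(f"unknown time of day: {rec.time_of_day!r}")
    if not 0.5 <= rec.distance_m <= 1.5:
        problems.append(f"distance outside 0.5-1.5 m: {rec.distance_m}")
    if not (rec.tilt_deg == 0 or 15 <= rec.tilt_deg <= 30):
        problems.append(f"tilt neither frontal nor 15-30 degrees: {rec.tilt_deg}")
    return problems
```

A record such as `CaptureRecord("cracks", 1, "morning", 1.0, 0.0)` passes with no violations. Returning a list of messages instead of raising on the first problem lets a whole batch of captures be reviewed in one pass.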

Categories

Concrete, Road

Licence