PREWID: PRE-industry Worker Image Dataset for Industrial Environments
Description
PREWID (Pre-Industry Worker Image Dataset for Industrial Environments) is a curated collection of grayscale images and video sequences designed to support the development, testing, and benchmarking of human detection and computer vision algorithms in "pre-Industry 4.0" settings. The dataset specifically addresses the technical challenges of legacy manufacturing environments, which are characterized by medium-to-low sensor resolutions, highly variable lighting conditions, and limited edge computational resources. It provides a balanced set of sequences containing both empty scenes and active workers, captured under real-world conditions from multiple perspectives, as well as a controlled laboratory sequence for baseline validation.

FORMAT
- Images: PNG grayscale (8-bit). Native resolutions: 640×480 px (CAM0–CAM3) and 1920×1080 px (CAMLAB01).
- Video: MP4 format (CAMLAB01_Sequence.mp4).
- Annotations: CSV master files and individual TXT files containing bounding box coordinates in the normalized YOLO format.

DATA ORGANIZATION & STRUCTURE
The dataset is organized into 2 main directories:
1. 01_Images: Contains raw sequential frames extracted from the deployment.
   - CAM0, CAM1, CAM2, CAM3: Image sequences from the industrial plant.
   - CAMLAB01: 3,000 sequential frames from the laboratory validation setup, along with the source video file.
2. 02_GroundTruth: Centralizes all coordinate and annotation tracking files.
   - GT_CAM0 to GT_CAM3, and GT_CAMLAB01: Subfolders with individual .txt files per frame. Empty files indicate no human presence.
   - groundtruth.csv: Master annotation file for the industrial environment (CAM0 to CAM3).
   - groundtruth_CAMLAB01.csv: Master annotation file for the 3,000 laboratory baseline frames.

PRIVACY & ANONYMIZATION
To comply with strict privacy regulations, camera angles and heights were strategically selected to capture workers from a distance, in profile, or from overhead. Facial features are indistinguishable across all frames, making the dataset fully anonymized while preserving key behavioral, postural, and spatial attributes.

USE CASE
Benchmarking Deep Learning models (such as YOLO variants) and traditional computer vision approaches for worker safety monitoring and resource-efficient edge inference.

***For detailed information regarding the multi-row logic of the CSV files and coordinate recovery, please refer to the Readme.txt file included in the repository.***
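
As an illustration of the per-frame TXT annotations, the following minimal Python sketch converts normalized YOLO boxes back to pixel coordinates. It assumes the common five-field YOLO line layout (class, x_center, y_center, width, height, all but the class normalized to [0, 1]); the description above does not spell out the exact field order, so Readme.txt remains the authoritative reference. The function name `parse_yolo_label` is hypothetical and not part of the dataset.

```python
from typing import List, Tuple

# (class_id, x_min, y_min, x_max, y_max) in pixels
Box = Tuple[int, float, float, float, float]


def parse_yolo_label(text: str, img_w: int, img_h: int) -> List[Box]:
    """Convert the contents of one per-frame .txt file to pixel boxes.

    Assumes each non-empty line is: class x_center y_center width height,
    with the four coordinates normalized by the image size (an assumption
    based on the usual YOLO convention, not confirmed by the dataset page).
    An empty file yields an empty list, i.e. no human presence.
    """
    boxes: List[Box] = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
        cls = int(fields[0])
        xc, yc, w, h = (float(v) for v in fields[1:])
        # Scale from normalized units to pixels.
        xc, w = xc * img_w, w * img_w
        yc, h = yc * img_h, h * img_h
        # Convert center/size to corner coordinates.
        boxes.append((cls, xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2))
    return boxes
```

For frames from CAM0–CAM3 the image size would be 640×480, and for CAMLAB01 it would be 1920×1080, per the native resolutions listed under FORMAT.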
Institutions
- Universitat Politècnica de València, Valencia, Valencia