CrowdHuman Dataset with YOLO Annotations

Name: CrowdHuman Dataset with YOLO Annotations
Creator: Madhuri Dharrao
Published: 2026-05-26T07:28:07.488Z
Keywords: Computer Vision, Object Detection, Crowd Analysis, Computer Imaging, Smart City, YOLOv5

Dharrao, Madhuri; Dharrao, Deepak

doi:10.17632/tttr2h6pz4.1

Published: 26 May 2026 | Version 1 | DOI: 10.17632/tttr2h6pz4.1

Description

The dataset contains approximately 12,000 training images, 1,500 validation images, and 1,500 testing images, following a standard 70% / 15% / 15% train-validation-test split. The dataset includes a single class, person, and all images are kept at their original resolutions, without resizing, to preserve real-world scene characteristics.

The dataset is designed for human detection in crowded environments and contains diverse scenarios with varying illumination, occlusion, motion blur, and dense pedestrian distributions. Some images may include partially visible humans, overlapping persons, blurred regions, and complex backgrounds, making the dataset suitable for robust object detection research.

This cleaned and annotated version of the dataset includes:

- Removal of missing, duplicate, and corrupted images.
- Human bounding box annotations generated using a pre-trained YOLOv8 model.
- Annotation files provided in standard YOLO .txt format for direct training compatibility.
- Additional annotations exported in CSV format for analysis and reuse in custom pipelines.

The dataset is beginner-friendly and can be effectively used for:

- Human detection projects
- Crowd analysis research
- Object detection model training
- Educational and academic purposes
- Benchmarking lightweight detection frameworks
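As an illustration of the YOLO .txt label format mentioned in the description, the sketch below reads label lines of the form `class_id x_center y_center width height` (all four box values normalized to the image size) and converts them to pixel corner coordinates, then writes them as CSV rows. The function names and the CSV column layout are hypothetical and chosen only for this example; the schema of the CSV files shipped with the dataset is not stated here and may differ.

```python
import csv
import io


def yolo_line_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    # YOLO stores the box centre and size as fractions of the image size.
    x_min = (cx - w / 2) * img_w
    y_min = (cy - h / 2) * img_h
    x_max = (cx + w / 2) * img_w
    y_max = (cy + h / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max


def labels_to_csv(image_name, label_text, img_w, img_h):
    """Render all boxes of one image as CSV text (hypothetical column layout)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["image", "class_id", "x_min", "y_min", "x_max", "y_max"])
    for line in label_text.splitlines():
        if line.strip():  # skip blank lines
            class_id, *box = yolo_line_to_pixels(line, img_w, img_h)
            writer.writerow([image_name, class_id] + [round(v, 1) for v in box])
    return out.getvalue()


if __name__ == "__main__":
    # One person box centred in an 800x600 image.
    print(labels_to_csv("example.jpg", "0 0.5 0.5 0.25 0.5\n", 800, 600))
```

Because the images keep their original resolutions, the width and height passed in have to be read per image (for example with an image library) rather than assumed constant.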

Steps to reproduce

The dataset was derived from the CrowdHuman dataset, containing images of crowded human scenes. Original annotations were converted into YOLO format with normalized bounding box coordinates for human detection (class_id = 0, person). The dataset was organized into training and validation folders following the YOLO directory structure. Preprocessing and annotation conversion were performed using Python, while model training and evaluation were conducted using the Ultralytics YOLO framework to ensure reproducibility.
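The conversion to normalized YOLO coordinates described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each source box is given in pixels as top-left `x`, `y` plus `width`, `height`, and the function name `to_yolo_line` is hypothetical. Clipping boxes to the image bounds is also an assumption of this sketch, added because boxes for partially visible people may extend past the image edge.

```python
def to_yolo_line(x, y, w, h, img_w, img_h, class_id=0):
    """Convert a pixel box (top-left x, y, width, height) to a YOLO label line.

    Returns None when the box lies entirely outside the image.
    """
    # Clip the box to the image (an assumption of this sketch).
    x_min = max(0.0, float(x))
    y_min = max(0.0, float(y))
    x_max = min(float(img_w), float(x) + float(w))
    y_max = min(float(img_h), float(y) + float(h))
    if x_max <= x_min or y_max <= y_min:
        return None
    # Normalize centre and size by the image dimensions.
    cx = (x_min + x_max) / 2 / img_w
    cy = (y_min + y_max) / 2 / img_h
    bw = (x_max - x_min) / img_w
    bh = (y_max - y_min) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"


if __name__ == "__main__":
    # A 200x300 pixel box in an 800x600 image; class_id 0 is "person".
    print(to_yolo_line(300, 150, 200, 300, 800, 600))
```

Each returned line would go into a .txt file that shares its base name with the corresponding image, one line per person.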

Institutions

Symbiosis International University
Maharashtra, Pune
