MMA Fighter Detection Dataset: Annotated UFC Stand-Up Combat Images for Computer Vision
Description
Dataset Description: This dataset addresses a significant gap in computer vision applications for combat sports by providing the first publicly available collection of annotated MMA fighter images specifically designed for object detection within the octagon environment. The primary research motivation is to enable the development of AI-based judging and analysis systems for combat sports, particularly mixed martial arts.

Research Hypothesis and Purpose: The creation of this dataset is driven by the hypothesis that accurate, automated detection and tracking of fighters during combat can serve as a foundation for AI-assisted judging systems in MMA. Such systems could potentially provide objective metrics on fighter positioning, movement patterns, strike exchanges, and octagon control.

What the Data Shows: The dataset contains 5,106 high-definition images extracted from 20 professional UFC fights, featuring elite fighters across multiple weight classes. Each image is annotated with bounding boxes identifying fighters in YOLOv8 format. The data specifically captures stand-up striking exchanges, the most visually dynamic and scoring-relevant portions of MMA competition.

Data Collection Methodology: Images were systematically extracted from official UFC broadcast footage available on UFC's YouTube channels. The fights were selected to ensure diversity in:
- Fighter body types and fighting styles (orthodox vs. southpaw, aggressive vs. defensive)
- Weight classes (bantamweight through light heavyweight)
- Camera angles and broadcast lighting conditions
- Octagon environments and venue settings

All images were resized to 640x640 pixels and manually annotated using the Roboflow platform. The single-class annotation approach (class: "fighter") simplifies the detection task while maintaining sufficient complexity for real-world application development.
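YOLO-format label files store one line per fighter as a class index followed by normalized center coordinates and box size. A minimal sketch of converting one such line back to pixel coordinates for the 640x640 images (the helper name is my own, not part of the dataset):

```python
def yolo_to_pixels(label_line: str, img_w: int = 640, img_h: int = 640):
    """Convert one YOLO label line to (class_id, (x1, y1, x2, y2)) in pixels."""
    parts = label_line.split()
    cls = int(parts[0])
    # YOLO stores center x/y and width/height as fractions of the image size.
    cx, cy, w, h = (float(v) for v in parts[1:5])
    cx, cy, w, h = cx * img_w, cy * img_h, w * img_w, h * img_h
    # Convert center/size to corner coordinates.
    x1, y1 = cx - w / 2, cy - h / 2
    x2, y2 = cx + w / 2, cy + h / 2
    return cls, (x1, y1, x2, y2)
```

For example, the line `0 0.5 0.5 0.25 0.5` describes a fighter box centered in the frame, a quarter of the frame wide and half the frame tall.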
Notable Characteristics and Limitations: A key feature of this dataset is its deliberate focus on stand-up striking scenarios. Ground grappling and cage-clinch situations have been intentionally excluded from this version. This design choice makes the dataset particularly suitable for:
- Strike detection and counting algorithms
- Fighter positioning and movement analysis
- Octagon control metrics
- Stand-up engagement classification

Researchers interested in comprehensive MMA analysis, including ground-game scenarios, should note this limitation and may need to supplement the dataset with additional data.

How to Interpret and Use the Data: The dataset follows standard YOLOv8 formatting conventions. Each image in the train/valid/test splits has a corresponding .txt annotation file containing normalized bounding-box coordinates (center_x, center_y, width, height) for each annotated fighter. The included data.yaml configuration file enables immediate integration with Ultralytics YOLOv8 training pipelines.
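For a single-class dataset with train/valid/test splits, the data.yaml would look roughly like the sketch below; the exact paths are assumptions, and the file shipped with the dataset is authoritative:

```yaml
# Hypothetical data.yaml layout for this dataset; verify against the shipped file.
path: .                # dataset root directory
train: train/images
val: valid/images
test: test/images
nc: 1                  # single class
names: ["fighter"]
```

With such a file in place, the Ultralytics pipeline can train directly, e.g. `YOLO("yolov8n.pt").train(data="data.yaml", imgsz=640)`.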