SiMo-Fish: Aerial Imagery Dataset for Behavioral Analysis of Flathead Grey Mullet (Mugil cephalus)
Description
The SiMo-Fish aerial image dataset was curated and annotated to support the training and validation of artificial intelligence models that recognize and segment the flathead grey mullet (Mugil cephalus). It provides a basis for automated monitoring systems that accurately identify, count, and describe the behavior of fish populations in captivity. The primary objective of this research is to develop computer vision software that serves as a reliable, non-invasive identifier of individual fish for use in further behavioral studies or assessments. By automatically extracting quantitative measurements and behavioral patterns, SiMo-Fish supports precision aquaculture, ethical fish management, and non-invasive monitoring.
Files
Steps to reproduce
The SiMo-Fish dataset was created through a reproducible methodology combining image acquisition, precise annotation, dataset diversification, and behavioral analysis. High-resolution images were collected at the IRTA facilities in La Ràpita (Spain) using a suspended camera system positioned above circular aquaculture tanks to capture top-down (zenithal) views, which reduces occlusions and keeps all specimens visible. Complementary underwater perspectives were obtained with the Barlus POE HD Underwater Surveillance System with Easy Remote Access, allowing continuous monitoring and consistent image capture.

Each individual mullet was annotated in two ways: with bounding boxes in YOLOv11 format, which provide efficient training data for object detection, and with pixel-level segmentation masks, which provide detailed instance-level annotations for morphological and positional analysis. To improve robustness and model generalization, the dataset covers diverse scenarios, including lighting variations (natural and artificial), differences in fish density (low, medium, high), and water turbidity levels (clear to cloudy). This variability is intended to make models trained on SiMo-Fish resilient to real-world aquaculture conditions.

The dataset is also designed for behavioral research. The annotations enable the extraction of metrics such as swimming speed, direction, and trajectories at both the individual and group level, while inter-individual distances serve as potential indicators of stress, aggregation, or social cohesion. By enabling automated and objective quantification of behavior, SiMo-Fish extends beyond classical detection tasks and offers a tool for advancing fish welfare research through computer vision. It provides the scientific community and industry stakeholders with a resource for developing automated monitoring systems, supporting the transition toward precise, data-driven, and ethical management of marine resources.
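As an illustration of how the behavioral metrics mentioned above could be derived from the bounding-box annotations, the following is a minimal sketch, not the authors' pipeline. It assumes the labels follow the common YOLO detection layout (one line per fish: class, x-center, y-center, width, height, all normalized to the image size). The image size, the frame interval, and the function names `parse_yolo_label`, `pairwise_distances`, and `speed_and_heading` are hypothetical and chosen for the example. It also assumes that the same individual has already been matched across frames by a separate tracking step.

```python
import math
from itertools import combinations


def parse_yolo_label(text, img_w, img_h):
    """Convert YOLO detection label lines to pixel-space box centers and sizes.

    Assumed line layout: class x_center y_center width height,
    with the four geometry values normalized to [0, 1].
    """
    boxes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip malformed lines or polygon (segmentation) rows
        cls = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        boxes.append({
            "cls": cls,
            "cx": xc * img_w,
            "cy": yc * img_h,
            "w": w * img_w,
            "h": h * img_h,
        })
    return boxes


def pairwise_distances(boxes):
    """Euclidean distance in pixels between every pair of box centers."""
    return [
        math.hypot(a["cx"] - b["cx"], a["cy"] - b["cy"])
        for a, b in combinations(boxes, 2)
    ]


def speed_and_heading(p0, p1, dt):
    """Speed (pixels/second) and heading (degrees) between two positions.

    Heading is measured from the image x-axis; because image y grows
    downward, positive angles turn clockwise on screen.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    speed = math.hypot(dx, dy) / dt
    heading = math.degrees(math.atan2(dy, dx)) % 360
    return speed, heading


# Hypothetical label file for one 1000 x 800 frame containing two fish.
label_text = "0 0.25 0.50 0.10 0.20\n0 0.75 0.50 0.10 0.20"
boxes = parse_yolo_label(label_text, img_w=1000, img_h=800)
distances = pairwise_distances(boxes)

# Hypothetical displacement of one tracked fish over a 0.5 s frame interval.
speed, heading = speed_and_heading((250.0, 400.0), (253.0, 404.0), dt=0.5)
```

Distances and speeds are in pixels here; converting them to physical units would require a calibration factor (for example, the known tank diameter in pixels), which is not specified in this description.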
Institutions
- Institut de Recerca i Tecnologia Agroalimentaries, Catalunya, Caldes de Montbui
- Universidad Autonoma de Chiapas, Chiapas, Tuxtla Gutierrez