Military objects in military environments
Description
The dataset consists of images of tanks, drones, soldiers, and civilians. The final dataset used to train our object detection model contains 7,985 images. The table below shows the distribution of images and instances for each class; instances refer to the number of targeted objects appearing in the images, such as tanks, drones, soldiers, or people. The number of images and instances per class was chosen to balance the dataset across the conditions listed below.

Class      Number of Images   Number of Instances
Tanks      3000               4990
Drones     1359               1296
People     2644               4492
Soldiers    982               3240

Each class is diversified across many conditions, including but not limited to:
• Clarity
• Weather
• Occlusion
• Time of day
• Illumination
• Image quality
• Discernibility
• Instance count
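As a quick sanity check on the table above, the per-class figures can be tallied in a few lines of Python (the dictionary below simply transcribes the table; the variable names are illustrative):

```python
# Class distribution: {class: (num_images, num_instances)},
# transcribed directly from the table above.
distribution = {
    "Tanks": (3000, 4990),
    "Drones": (1359, 1296),
    "People": (2644, 4492),
    "Soldiers": (982, 3240),
}

total_images = sum(images for images, _ in distribution.values())
total_instances = sum(instances for _, instances in distribution.values())

print(total_images)     # 7985 images in the final dataset
print(total_instances)  # 14018 labeled instances overall
```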
Files
Steps to reproduce
The dataset combines public data and synthetic data generated with the GTA5 simulator. Public datasets for tanks, drones, and people were sourced from reliable platforms such as RoboFlow and Kaggle. For soldiers, synthetic data was created using GTA5, which simulates realistic military environments.

We use two techniques, synthetic data collection and image augmentation, to supplement our collected training data and build a comprehensive collection of images of soldiers in conditions that simulate real-world battlefields to a great extent. We use the "GTA5" video game, an open-sandbox, photo-realistic game with realistic renditions of different weather conditions and times of day and a wide range of terrain, to collect synthetic data of soldiers. The game provides a "Director Mode," which allows characters to be outfitted with military-style gear and outfits and the environment to be customized with high flexibility. We used this mode to gather 982 images of soldiers and then augmented the images to expand the dataset further.

Images are annotated using the Computer Vision Annotation Tool (CVAT). Bounding boxes are drawn around each object (tanks, drones, soldiers, and people) to label its class and location and ensure accurate object detection. Each labeled instance is then verified for accuracy to prevent errors in model training.

Various data augmentation techniques are applied to increase the dataset's diversity and ensure the model can generalize well under varying conditions. These include:
• Scaling: Adjusting the size of objects.
• Rotation: Randomly rotating images to simulate different angles.
• Translation: Shifting images horizontally or vertically.
• Flipping: Horizontally flipping images to simulate different orientations.
• Noise Addition: Adding random image noise to make the model more robust.
• Padding: Extending the boundaries of images to create varied sizes.
• Cutout: Randomly masking parts of the image to improve robustness.
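CVAT can export annotations in several formats. As an illustration only (the source does not state which export format was used), the sketch below assumes YOLO-style labels, where each line holds a class index followed by a normalized box center and size; the class ordering here is hypothetical:

```python
# YOLO label line: <class_id> <x_center> <y_center> <width> <height>,
# with coordinates normalized to [0, 1] relative to the image size.
CLASSES = ["tank", "drone", "person", "soldier"]  # illustrative class order

def parse_yolo_line(line: str, img_w: int, img_h: int):
    """Convert a normalized YOLO label line to (class, x, y, w, h) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    x = (xc - w / 2) * img_w   # top-left corner x
    y = (yc - h / 2) * img_h   # top-left corner y
    return CLASSES[int(cls)], x, y, w * img_w, h * img_h

# A box centered in a 640x480 image, a quarter of its size in each dimension:
label, x, y, w, h = parse_yolo_line("0 0.5 0.5 0.25 0.25", 640, 480)
print(label, x, y, w, h)  # tank 240.0 180.0 160.0 120.0
```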
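A few of the augmentations listed above can be sketched with plain NumPy; this is a minimal illustration, not the pipeline actually used, and the function names are our own:

```python
import numpy as np

rng = np.random.default_rng(0)

def flip_horizontal(img: np.ndarray) -> np.ndarray:
    """Mirror the image left-to-right (flipping augmentation)."""
    return img[:, ::-1]

def add_noise(img: np.ndarray, sigma: float = 10.0) -> np.ndarray:
    """Add Gaussian pixel noise (noise-addition augmentation)."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def pad(img: np.ndarray, border: int = 16) -> np.ndarray:
    """Extend image boundaries with black pixels (padding augmentation)."""
    return np.pad(img, ((border, border), (border, border), (0, 0)))

def cutout(img: np.ndarray, size: int = 32) -> np.ndarray:
    """Zero out a random square region (cutout augmentation)."""
    h, w = img.shape[:2]
    y = rng.integers(0, max(1, h - size))
    x = rng.integers(0, max(1, w - size))
    out = img.copy()
    out[y:y + size, x:x + size] = 0
    return out

# Example: chain the augmentations on a dummy 128x128 RGB image.
image = rng.integers(0, 256, (128, 128, 3), dtype=np.uint8)
augmented = cutout(pad(add_noise(flip_horizontal(image))))
print(augmented.shape)  # (160, 160, 3) after 16-pixel padding on each side
```

Note that geometric augmentations (scaling, rotation, translation, flipping, padding) must also transform the bounding-box annotations accordingly; the sketch covers only the image side.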