Horseradish and weed dataset from commercial fields in Southern Illinois
Description
This dataset contains annotated RGB images collected to develop and benchmark YOLO models for weed detection in commercial horseradish (Armoracia rusticana) production systems in Southern Illinois. The data support research on vision-based, real-time robotic weeding in a high-value specialty crop with limited herbicide options. The dataset accompanies the manuscript submitted to Frontiers in Agronomy, titled 'Evaluation of YOLO-based Weed Detection Models on Commercial Horseradish Fields in Southern Illinois'.

Images were captured during the 2024 growing season from two commercial horseradish fields in Collinsville, Illinois, and one research plot (Illinois Autonomous Farm, UIUC). Two acquisition platforms were used:

* A handheld Apple iPhone 13 Mini mounted on a monopod for proximal imaging between crop rows.
* A Farm-ng Amiga mobile robot equipped with Luxonis OAK-D cameras for robotic imaging.

Videos were recorded at 30 and 60 fps and subsequently extracted into image frames. After curation and augmentation (e.g., flipping, saturation adjustments), the dataset contains 2,696 images. Each image is annotated for object detection with two classes:

i. Horseradish (crop)
ii. Weed (a composite non-crop class, including key species such as waterhemp, Amaranthus tuberculatus, and Palmer amaranth, Amaranthus palmeri, along with other broadleaf and grass weeds)

Annotations are provided in a YOLO-compatible format to simplify training. The dataset was originally used to compare multiple YOLO variants (nano, small, and medium models from YOLOv8, YOLOv11, and YOLOv12). Models were compared on accuracy metrics (precision, recall, F1-score, mAP@50) and computational criteria (inference time, model size, GFLOPs). We also evaluated several compute platforms for potential deployment on embedded and edge devices.

Intended uses include:

* Training and evaluating crop-weed detection models in specialty crops.
* Benchmarking lightweight YOLO object detectors for real-time inference.
* Testing inference with the best tuned and trained model (provided as Horseradish_8n_200epochs_best.pt).

Along with the images and labels, we provide an example notebook (training_script.ipynb) demonstrating the training and inference scripts, and brief documentation describing the dataset structure, class definitions, and recommended preprocessing steps. The uploaded folder is arranged in the following file structure:

Horseradish-weed-dataset
┣ data
┃ ┣ test
┃ ┃ ┣ images
┃ ┃ ┗ labels
┃ ┣ train
┃ ┃ ┣ images
┃ ┃ ┗ labels
┃ ┣ valid
┃ ┃ ┣ images
┃ ┃ ┗ labels
┃ ┗ data.yaml
┣ Horseradish_8n_200epochs_best.pt
┣ requirements.txt
┗ training_script.ipynb
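Since the labels follow the YOLO text format, each line of a label file holds one object as `class_id x_center y_center width height`, with coordinates normalized to the image size. Below is a minimal sketch of a parser, plus the horizontal-flip transform corresponding to the flipping augmentation mentioned above. The class-index order is an assumption for illustration; the authoritative mapping is defined in data.yaml.

```python
# Assumed class-index mapping; verify against data.yaml before use.
CLASS_NAMES = {0: "horseradish", 1: "weed"}

def parse_yolo_labels(text):
    """Parse YOLO-format label text into (class_name, x, y, w, h) tuples.

    Each input line is "<class_id> <x_center> <y_center> <width> <height>",
    with all coordinates normalized to [0, 1].
    """
    boxes = []
    for line in text.strip().splitlines():
        cls, x, y, w, h = line.split()
        boxes.append((CLASS_NAMES[int(cls)], float(x), float(y),
                      float(w), float(h)))
    return boxes

def hflip_box(box):
    """Horizontally flip a normalized YOLO box: only x_center changes."""
    name, x, y, w, h = box
    return (name, 1.0 - x, y, w, h)
```

Because the boxes are normalized, the flip needs no image dimensions, which is one reason the YOLO label format is convenient for augmentation pipelines.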
Steps to reproduce
Instructions are provided in the Python notebook (training_script.ipynb). If you encounter any issues, please contact Abhinav Pagadala (asp14@illinois.edu) or Sunoj Shajahan (sunoj@illinois.edu).
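Before training, it can help to confirm that each split under data/ is internally consistent, i.e., every image in images/ has a matching .txt file in labels/. A small sketch of such a check (the function name is illustrative and not part of the provided scripts):

```python
from pathlib import Path

# Image extensions expected in the dataset splits.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}

def unlabeled_images(split_dir):
    """Return stems of images in <split_dir>/images with no matching
    .txt annotation in <split_dir>/labels (empty list = consistent)."""
    split = Path(split_dir)
    label_stems = {p.stem for p in (split / "labels").glob("*.txt")}
    return sorted(
        p.stem
        for p in (split / "images").iterdir()
        if p.suffix.lower() in IMAGE_EXTS and p.stem not in label_stems
    )
```

Running this on data/train, data/valid, and data/test should return empty lists if the download is intact.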
Institutions
- University of Illinois Urbana-Champaign, Urbana, Illinois
Funders
- National Institute of Food and Agriculture, United States Department of Agriculture, Washington, United States (Grant ID: 2020-67021-32799)