MH-Weed16: An Indian Multiclass Annotated Weed Dataset for Computer Vision Tasks

Published: 15 September 2025 | Version 2 | DOI: 10.17632/d3n3mgjjbv.2
Contributors:
Sayali Shinde, Dr. Vahida Attar, COEP Technological University Pune, IDEAS Technology Innovation Hub, Indian Statistical Institute Kolkata

Description

Weeds are invasive plants that compete with crops for vital nutrients and often attract pests, significantly reducing agricultural productivity. They account for approximately 45% of the annual productivity loss in farming. Manual weeding is effective but labor-intensive and costly, particularly for smallholder farmers. Excessive reliance on chemical herbicides, in turn, has led to herbicide resistance in several weed species, which further complicates weed management. Artificial intelligence (AI) and computer vision can automate such labor-intensive tasks: computer vision identifies crops and weeds in images, supporting autonomous systems for selective weeding and targeted herbicide application. Robust AI models require high-quality datasets.

To address this need, we introduce the MH-Weed16 Image Dataset, collected from soybean fields in the Maharashtra region of India between July 2023 and November 2023 under diverse natural field conditions. The dataset comprises a total of 18,677 images of 16 weed species, annotated with the guidance of agricultural experts. It also includes 7,577 representative crop samples, and 6,656 weed samples are annotated with bounding boxes. Images of crop–weed interactions were captured from a top-down perspective so that weed area can be estimated from the bounding box annotations.

The dataset also incorporates 282 UAV-captured images, which provide a large-scale aerial perspective that complements the ground-based close-range images. These UAV images add diversity in resolution, field scale, and occlusion conditions, making the dataset suitable for both macro-level weed distribution mapping and micro-level species identification. This multi-platform design strengthens the dataset’s applicability to precision agriculture, enabling the training and evaluation of computer vision models for object detection, classification, and weed–crop discrimination under real-world field conditions.
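
The description states that the top-down images allow weed area to be estimated from bounding box annotations. The following is a minimal Python sketch of one way this could be done, not the authors' method. It assumes boxes are given in pixel coordinates as (x_min, y_min, x_max, y_max); the dataset's actual annotation format is not specified here, and the function names union_area and weed_cover_fraction are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
# The real annotation format of the dataset may differ.
Box = Tuple[float, float, float, float]


def union_area(boxes: Iterable[Box]) -> float:
    """Return the area covered by at least one box, counting overlaps once."""
    valid: List[Box] = [b for b in boxes if b[2] > b[0] and b[3] > b[1]]
    # Coordinate compression: box edges split the plane into grid cells.
    xs = sorted({x for b in valid for x in (b[0], b[2])})
    ys = sorted({y for b in valid for y in (b[1], b[3])})
    area = 0.0
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            # A cell counts if it lies entirely inside any box.
            if any(b[0] <= x0 and x1 <= b[2] and b[1] <= y0 and y1 <= b[3]
                   for b in valid):
                area += (x1 - x0) * (y1 - y0)
    return area


def weed_cover_fraction(boxes: Iterable[Box],
                        image_width: int,
                        image_height: int) -> float:
    """Fraction of the image covered by weed bounding boxes (0.0 to 1.0)."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    # Clip each box to the image so out-of-frame parts are not counted.
    clipped = [
        (max(0.0, x0), max(0.0, y0),
         min(float(image_width), x1), min(float(image_height), y1))
        for x0, y0, x1, y1 in boxes
    ]
    return union_area(clipped) / (image_width * image_height)


# Two overlapping 50x50 boxes in a 100x100 image:
# 2500 + 2500 - 625 (overlap) = 4375 pixels, i.e. a fraction of 0.4375.
example_fraction = weed_cover_fraction(
    [(0, 0, 50, 50), (25, 25, 75, 75)], 100, 100
)
```

Because a bounding box encloses the plant rather than tracing its outline, a fraction computed this way is an upper-bound style estimate of weed cover. Converting it to ground area would additionally require the ground sampling distance of the image, which is not given in this description.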

Categories

Computer Vision, Machine Learning, Weed Management, Deep Learning

Licence