SAR-CLD-2024: A Comprehensive Dataset for Cotton Leaf Disease Detection

Published: 20 May 2024 | Version 1 | DOI: 10.17632/b3jy2p6k8w.1


The Cotton Leaf Disease Detection Dataset is a resource for researchers, practitioners, and stakeholders in the agricultural sector who work on cotton leaf diseases. By supporting accurate identification and classification of these diseases, the dataset enables early detection, so that farmers can act in time and optimize crop management strategies. It also supports the development of machine learning algorithms and methodologies for disease detection.

The dataset comprises curated images capturing various stages of cotton leaf diseases, collected at the National Cotton Research Institute field in Gazipur. The images were captured using a Redmi Note 11s smartphone and represent a diverse range of disease manifestations across different dimensions. Field surveys were conducted between October 2023 and January 2024 under the guidance of domain experts, which ensured high-quality image acquisition despite challenges such as fluctuating lighting conditions.

The dataset contains 2137 images divided into seven classes covering cotton leaf conditions such as bacterial blight, curl virus, and healthy leaves. The classes represent different issues affecting cotton plants, including diseases, pests, and environmental stress. The dataset underwent data cleaning, labeling, and augmentation, with augmentation techniques such as flipping and brightening. As a result, it comprises 2137 original images and 7000 augmented images, intended to improve the effectiveness of deep learning models for classifying and diagnosing cotton leaf diseases.

1. Original (Cotton Leaf Disease Detection) Dataset:
Number of images: 2137
Data format: .jpg

2. Augmented (Cotton Leaf Disease Detection) Dataset:
Number of images: 7000
Data format: .jpg
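
The flipping and brightening augmentations mentioned in the description can be illustrated with a minimal sketch. The exact augmentation recipe used for this dataset is not stated, so the function names, the brightness factor of 1.2, and the choice of three variants per image are assumptions made for illustration. Images are represented here as plain nested lists of RGB tuples to keep the example dependency-free; in practice an imaging library would normally be used.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels, channel values 0-255


def flip_horizontal(image: Image) -> Image:
    """Mirror the image left to right by reversing each row."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image: Image) -> Image:
    """Mirror the image top to bottom by reversing the row order."""
    return [list(row) for row in reversed(image)]


def brighten(image: Image, factor: float) -> Image:
    """Scale every channel by `factor`, clamping results to 0-255."""
    return [
        [tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
        for row in image
    ]


def augment(image: Image, factor: float = 1.2) -> List[Image]:
    """Return three variants of one image (hypothetical recipe, not the authors')."""
    return [
        flip_horizontal(image),
        flip_vertical(image),
        brighten(image, factor),
    ]
```

Each function returns a new image and leaves its input untouched, so the original images can be kept alongside the augmented ones, as in the two-part layout of this dataset.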


Steps to reproduce

Data was collected at the National Cotton Research Institute field in Gazipur, Dhaka, Bangladesh. Field surveys conducted from October 2023 to January 2024, supervised by experts, ensured careful image capture under different environmental conditions. The dataset comprises a collection of 2137 images depicting various stages of cotton leaf diseases. These images are categorized into seven classes: bacterial blight, curl virus, herbicide growth damage, leaf hopper jassids, leaf reddening, leaf variegation, and healthy leaves. Each class represents specific manifestations of diseases, pests, or environmental stress in cotton plants. This dataset provides a comprehensive range of visual traits essential for training
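
A downloaded copy can be checked against the figures stated in the description (seven classes, 2137 original images, 7000 augmented images) with a short script. The sketch below assumes one sub-folder per class containing .jpg files; that folder layout and both function names are assumptions for illustration, not documented properties of the archive.

```python
from pathlib import Path
from typing import Dict


def count_images_per_class(root: Path) -> Dict[str, int]:
    """Count .jpg files in each immediate sub-folder of `root`.

    Assumes a hypothetical layout of one folder per class.
    """
    counts: Dict[str, int] = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".jpg"
        )
    return counts


def check_totals(counts: Dict[str, int], expected_classes: int,
                 expected_images: int) -> bool:
    """Return True if the class count and the image total both match."""
    return (
        len(counts) == expected_classes
        and sum(counts.values()) == expected_images
    )
```

For the original set, a call such as `check_totals(count_images_per_class(Path("path/to/original")), 7, 2137)` would confirm the stated totals, where the path is a placeholder for wherever the archive was extracted.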


Daffodil International University


Agricultural Science, Artificial Intelligence, Computer Vision, Machine Learning, Deep Learning