Lentil Plant Disease Image Dataset (4 Class)
Description
The Lentil Plant Disease Image Dataset is a collected, organized, augmented, and preprocessed set of high-resolution images intended to support research in plant pathology and machine learning. It covers four classes: lentil plants affected by three common diseases (Ascochyta Blight, Lentil Rust, and Powdery Mildew) and healthy plants. The images were captured with a Samsung M31 smartphone across various growth stages and lighting conditions, so that a wide range of symptoms is represented. Data augmentation (rotation, width/height shift, shear, zoom, horizontal flipping, and brightness adjustment) was applied to increase the dataset's variability. The dataset is labeled and organized by class and divided into training, testing, and validation sets with an 80-10-10 split. It is intended for researchers and practitioners in computer science, artificial intelligence, computer vision, machine learning, deep learning, and agriculture.
Steps to reproduce
1. Dataset Collection
Define Objectives: Determine the types of diseases and conditions to be included: Ascochyta Blight, Lentil Rust, Powdery Mildew, and Normal (healthy plants).
Select Locations: Choose diverse farms and research stations in Barishal, Bangladesh, to ensure a broad representation of conditions and stages.
Equipment Setup: Use a Samsung M31 smartphone for capturing images. Ensure the camera settings are optimized for high-resolution images (64MP sensor, f/1.8 aperture).
Image Capture:
  Timing: Collect images between May and August 2023.
  Conditions: Photograph lentil plants in various growth stages and under different lighting conditions to capture a wide range of symptoms.
  Resolution: Ensure all images are captured at a resolution of 224×224 pixels.
Labeling: Label each image according to its class: Ascochyta Blight, Lentil Rust, Powdery Mildew, or Normal.
Storage: Store images in JPEG format to maintain quality and consistency.

2. Data Augmentation
Set Up Augmentation Tools: Use a data augmentation library or tool (e.g., TensorFlow ImageDataGenerator, OpenCV) to apply transformations.
Define Augmentation Parameters:
  Rotation: Apply random rotations within a 40-degree range.
  Width/Height Shift: Use a width and height shift range of 0.2.
  Shear: Apply a shear range of 0.2.
  Zoom: Apply a zoom range of 0.2.
  Horizontal Flipping: Enable horizontal flipping.
  Brightness: Adjust brightness within a range of 0.5 to 1.5.
  Fill Mode: Use 'nearest' fill mode to handle image modifications.
Generate Augmented Images: Apply the above augmentation techniques to the original images to create new variations. Ensure that the augmented images are paired with the appropriate original sample images for consistency.
Organize and Store: Save augmented images in the same format and resolution as the original images. Maintain a clear structure for the dataset, including separate folders for each class.

3. Dataset Preparation
Dataset Split: Divide the dataset into training (80%), testing (10%), and validation (10%) sets. Ensure that each set includes a balanced representation of all classes.
Documentation: Create a README file detailing the dataset structure, collection methods, augmentation procedures, and any relevant information. Include information on image resolution, format, and class labeling.
Review and Quality Check: Verify the accuracy of class labels and the quality of images. Ensure that the dataset is well-organized and all files are correctly labeled.
Publish: Upload the dataset to a repository (e.g., Mendeley Data) with a clear description, licensing information, and access details.
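The augmentation parameters and the 80-10-10 split described above can be sketched in Python as follows. This is a minimal illustration, not the authors' actual script. It assumes a TensorFlow version that still ships the deprecated ImageDataGenerator class. The names AUGMENTATION_PARAMS, make_generator, and stratified_split are hypothetical, and the per-class splitting strategy is one possible way to keep every class represented in each subset.

```python
import random
from collections import defaultdict

# Augmentation settings from step 2, written as keyword arguments for
# tf.keras.preprocessing.image.ImageDataGenerator. The values are taken
# from the text; the dictionary name is hypothetical.
AUGMENTATION_PARAMS = {
    "rotation_range": 40,
    "width_shift_range": 0.2,
    "height_shift_range": 0.2,
    "shear_range": 0.2,
    "zoom_range": 0.2,
    "horizontal_flip": True,
    "brightness_range": (0.5, 1.5),
    "fill_mode": "nearest",
}


def make_generator():
    """Build the Keras generator. TensorFlow is imported only when this
    function is called, so the rest of the sketch runs without it."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**AUGMENTATION_PARAMS)


def stratified_split(labels_by_file, seed=0):
    """Split file names 80/10/10 within each class (assumed strategy),
    so that every subset contains all classes."""
    by_class = defaultdict(list)
    for name, label in labels_by_file.items():
        by_class[label].append(name)

    splits = {"train": [], "test": [], "val": []}
    rng = random.Random(seed)
    for label in sorted(by_class):
        names = sorted(by_class[label])
        rng.shuffle(names)
        n_train = len(names) * 8 // 10   # 80% for training
        n_test = len(names) // 10        # 10% for testing
        splits["train"] += names[:n_train]
        splits["test"] += names[n_train:n_train + n_test]
        splits["val"] += names[n_train + n_test:]  # remainder for validation
    return splits
```

In use, labels_by_file would map each JPEG file name to one of the four class labels (Ascochyta Blight, Lentil Rust, Powdery Mildew, Normal). The generator returned by make_generator() could then be applied to the original images, for example through its flow_from_directory method with one folder per class.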