Disease Dataset of Wheat: Original, Augmented, and Balanced for Deep Learning
Description
The dataset was collected from a wheat crop field in Bangladesh and covers five categories of wheat leaf images. It contains 1,603 original images at a resolution of 1920 × 1080 pixels, divided into five classes (four diseases plus healthy leaves): Black Point (303 images), Fusarium Foot Rot (250 images), Healthy Leaf (250 images), Leaf Blight (400 images), and Wheat Blast (400 images). Because the original classes are unevenly sized, data augmentation was applied to bring each class to 1,000 images, so the augmented dataset contains 5,000 images distributed equally across the five classes. For model training and evaluation, the augmented dataset is split into training (70%), testing (20%), and validation (10%) portions. Keeping separate test and validation sets allows generalization to be measured on images the model has not seen during training.
The dataset is organized into three main directories:
1) Original Dataset: Contains the raw images captured directly in the field.
2) Augmented Dataset: Contains the synthetic images generated to balance the class distribution.
3) Split Dataset: Contains the training, testing, and validation divisions derived from the augmented dataset.
The dataset is intended as a resource for research on wheat disease classification, agricultural AI, and deep learning-based plant disease identification.
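The class counts and the 70/20/10 split described above can be checked with a short sketch. This is only an illustration, not the authors' splitting script: the file names, the `split_class` helper, the fixed seed, and the choice to split each class separately (a stratified split) are all assumptions.

```python
import random

# Original image counts per class, as stated in the description.
ORIGINAL_COUNTS = {
    "Black Point": 303,
    "Fusarium Foot Rot": 250,
    "Healthy Leaf": 250,
    "Leaf Blight": 400,
    "Wheat Blast": 400,
}
TARGET_PER_CLASS = 1000  # balanced class size after augmentation


def split_class(items, train=0.7, test=0.2, seed=0):
    """Shuffle one class's items and cut them into train/test/validation lists.

    Hypothetical helper: whatever is left after the train and test cuts
    (10% with the default fractions) becomes the validation set.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_test = round(len(shuffled) * test)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_test],
        shuffled[n_train + n_test:],
    )


# Hypothetical file names standing in for the real augmented images.
splits = {}
for name in ORIGINAL_COUNTS:
    files = [f"{name}/img_{i:04d}.jpg" for i in range(TARGET_PER_CLASS)]
    splits[name] = split_class(files)

for name, (train_set, test_set, val_set) in splits.items():
    print(name, len(train_set), len(test_set), len(val_set))
```

With 1,000 images per class this gives 700 training, 200 testing, and 100 validation images for each class, or 3,500 / 1,000 / 500 images overall. Splitting per class keeps every portion balanced across the five classes.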
Files
Steps to reproduce
Data Augmentation Procedure: Several data augmentation techniques were applied to increase the number of images and to make model training more stable. The transformations add diversity to the dataset so that trained models generalize better to new, unseen data.
Augmentation Techniques Applied:
1. Geometric Transformations:
1) Rotation: Random angles (e.g., 15°, 30°) for perspective variation.
2) Flipping: Horizontal and vertical flips for orientation diversity.
3) Scaling: Enlargement (1.1x, 1.3x) while keeping the image dimensions unchanged.
4) Shifting: Translation by ±10 pixels in the x and y directions.
5) Center Cropping: Cropping to the central 80% or 60% of the image, followed by resizing.
2. Color & Intensity Modifications:
1) Brightness: Adjusted by a factor of 0.7x to 1.3x.
2) Contrast: Modified to simulate varied lighting.
3) Saturation: Altered to simulate environmental changes.
These augmentations expose models to a wider range of conditions, which makes them more robust and better adapted to real-world images. The augmentations are applied to the data before model training begins.
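Some of the transformations listed above can be sketched without any dependencies, using a tiny grayscale image stored as nested lists. This is a simplified illustration and not the authors' pipeline: in practice an image library would be used. The function names, the zero fill for shifted pixels, the nearest-neighbour resize back to the original size, and the 50% flip probability are all assumptions. Rotation, contrast, and saturation are left out for brevity.

```python
import random


def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the order of the rows (top to bottom)."""
    return [row[:] for row in img[::-1]]


def shift(img, dx, dy, fill=0):
    """Translate by (dx, dy) pixels; uncovered pixels get `fill` (assumed)."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize to new_h x new_w."""
    h, w = len(img), len(img[0])
    return [[img[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]


def center_crop_resize(img, fraction):
    """Keep the central `fraction` of each side, then resize to the input size."""
    h, w = len(img), len(img[0])
    ch, cw = max(1, round(h * fraction)), max(1, round(w * fraction))
    top, left = (h - ch) // 2, (w - cw) // 2
    cropped = [row[left:left + cw] for row in img[top:top + ch]]
    return resize_nearest(cropped, h, w)


def adjust_brightness(img, factor):
    """Scale pixel values by `factor`, clipped to the 0..255 range."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]


def random_augment(img, rng):
    """Apply one random combination, using the parameter ranges from the text."""
    if rng.random() < 0.5:  # flip probability is an assumption
        img = flip_horizontal(img)
    if rng.random() < 0.5:
        img = flip_vertical(img)
    img = shift(img, rng.randint(-10, 10), rng.randint(-10, 10))
    img = center_crop_resize(img, rng.choice([1.0, 0.8, 0.6]))
    return adjust_brightness(img, rng.uniform(0.7, 1.3))


# A 4 x 4 test image with pixel values 0, 10, ..., 150.
image = [[10 * (4 * y + x) for x in range(4)] for y in range(4)]
augmented = random_augment(image, random.Random(0))
```

Every function returns an image with the same dimensions as its input, matching the description of scaling and cropping steps that keep the image size unchanged. The ±10 pixel shift is sized for the 1920 × 1080 originals and would move most of a 4 × 4 test image out of frame.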