MA-LeafFruitDx: An Image Dataset for Deep Learning-Based Diagnosis of Mangrove Apple Diseases

Published: 27 May 2025 | Version 1 | DOI: 10.17632/jpgggxz6xc.1

Description

MA-LeafFruitDx is a curated image dataset for leaf and fruit disease classification in the Mangrove Apple (Sonneratia caseolaris), designed to support research in plant health monitoring using machine learning, deep learning, and computer vision. It captures both healthy and diseased states of the leaves and fruits of the Mangrove Apple, a plant native to the Sundarban mangrove forest in Bangladesh.

Total Number of Images:
- Original Dataset Size: 2,307 images
- Augmented Dataset Size: 17,500 images (3,500 per class × 5 classes)

Class-wise Image Distribution (Original Dataset):
1. Healthy Fruits: 420
2. Healthy Leaves: 494
3. Insect Hole Leaves: 410
4. Unhealthy Fruits: 477
5. Yellow Leaves: 506

Data Augmentation: To address class imbalance and improve model generalization, the dataset was augmented to 3,500 images per class, bringing the total dataset size to 17,500 images. The following augmentation techniques were applied:
- Geometric Transformations: Rotation, flipping (horizontal and vertical), random cropping
- Photometric Enhancements: Brightness adjustment, contrast modification, Gaussian blur
- Noise Injection: Gaussian noise and salt-and-pepper noise
- Color Space Variations: HSV shifts, grayscale conversion
- Perspective and Affine Transformations

Image Format: All images are in .jpg format.

Camera Devices Used:
1. Samsung Galaxy S23+: 1,568 images
2. OnePlus GM1901: 739 images

Geographical Location: Data Collection Site: Sundarban Mangrove Forest, Bangladesh, a UNESCO World Heritage Site and home to a wide diversity of plant and animal species.

Application Areas

This dataset has applications across ML, DL, and CV research fields:

1. Machine Learning
   1. Feature engineering and selection for classical ML models (e.g., SVM, Random Forest, XGBoost)
   2. Disease pattern classification using handcrafted features
   3. Model benchmarking for low-resource environments

2. Deep Learning
   1. Training and evaluation of Convolutional Neural Networks (CNNs)
   2. Transfer learning with pre-trained models (e.g., ResNet, EfficientNet, InceptionV3)
   3. Fine-grained image classification and object localization
   4. Use in multi-class classification and semantic segmentation tasks

3. Computer Vision
   1. Development of real-time plant disease detection systems
   2. Implementation in mobile applications for on-field crop monitoring
   3. Use in smart agriculture systems (IoT + CV)
   4. Integration with image segmentation and disease severity scoring
   5. Exploration of vision transformers (ViTs) and self-supervised learning

The dataset is intended as a resource for researchers working on automated plant disease recognition, especially in mangrove ecosystems, which are often underrepresented in computer vision literature.
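The class counts and augmentation targets above can be checked with a few lines of arithmetic. The following is a minimal sketch in pure Python, not the authors' pipeline. The names `augmentation_plan`, `horizontal_flip`, and `salt_and_pepper` are hypothetical, the "extra" figure assumes that the 3,500 images per class include the originals, and the two toy transforms only illustrate the flipping and salt-and-pepper noise listed in the description, applied to a small grayscale image stored as a list of rows.

```python
import random

# Original per-class counts as listed in the dataset description.
ORIGINAL_COUNTS = {
    "Healthy Fruits": 420,
    "Healthy Leaves": 494,
    "Insect Hole Leaves": 410,
    "Unhealthy Fruits": 477,
    "Yellow Leaves": 506,
}
TARGET_PER_CLASS = 3500  # augmented size per class, from the description


def augmentation_plan(counts, target):
    """Return, per class, the expansion factor and the number of extra images.

    Assumption (not stated in the source): the per-class target
    includes the original images.
    """
    plan = {}
    for name, n in counts.items():
        plan[name] = {
            "factor": round(target / n, 2),  # target size / original size
            "extra": target - n,             # images to generate
        }
    return plan


def horizontal_flip(image):
    """Mirror a 2-D grayscale image (list of rows) left to right."""
    return [row[::-1] for row in image]


def salt_and_pepper(image, amount, rng):
    """Set each pixel, with probability `amount`, to 0 (pepper) or 255 (salt)."""
    noisy = [row[:] for row in image]  # copy so the input stays unchanged
    for row in noisy:
        for c in range(len(row)):
            if rng.random() < amount:
                row[c] = rng.choice((0, 255))
    return noisy


plan = augmentation_plan(ORIGINAL_COUNTS, TARGET_PER_CLASS)
total_original = sum(ORIGINAL_COUNTS.values())
total_augmented = TARGET_PER_CLASS * len(ORIGINAL_COUNTS)

if __name__ == "__main__":
    print("original:", total_original, "augmented:", total_augmented)
    for name, info in plan.items():
        print(f"{name}: x{info['factor']} (+{info['extra']} images)")
```

Under these assumptions, the smallest class (Insect Hole Leaves) needs the largest expansion and the largest class (Yellow Leaves) the smallest, which is how a fixed per-class target evens out the original imbalance.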

Institutions

  • Daffodil International University

Categories

Computer Vision, Machine Learning, Image Classification, Sustainable Agriculture, Deep Learning, Agriculture

Licence