Published: 26 March 2021 | Version 1 | DOI: 10.17632/t2r6rszp5c.1


The JMuBEN dataset was created to add to the limited number of Arabica coffee leaf images available online. It lets researchers test the accuracy of their deep learning models without spending much time collecting data in the field. The Arabica dataset (JMuBEN) contains images suitable for training and validating deep learning algorithms for plant disease recognition and classification. The leaf images were collected from Arabica coffee plants and cover three sets of unhealthy leaves. In total, the JMuBEN dataset includes 22,591 images of Arabica coffee leaves. The images were cropped to emphasize the region of interest where the disease is visible, which reduces training time by avoiding background learning. They were also resized to keep shape and size uniform. The smaller images were augmented to help prevent over-fitting during model training and validation. The dataset comprises 3 files containing images of coffee leaves affected by Coffee Rust, Cercospora and Phoma.
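Since the images are intended for both training and validation, the following is a minimal sketch of one way to split them per class. It is not part of the dataset itself: the function name `split_train_val`, the 80/20 ratio, the fixed seed, and the example file names are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (image path, class label)


def split_train_val(
    paths_by_class: Dict[str, List[str]],
    val_fraction: float = 0.2,  # assumed ratio, not prescribed by the dataset
    seed: int = 0,              # fixed seed so the split is reproducible
) -> Tuple[List[Sample], List[Sample]]:
    """Split image paths into train and validation lists, class by class."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    train: List[Sample] = []
    val: List[Sample] = []
    for label in sorted(paths_by_class):
        # Sort first so the result does not depend on input order.
        paths = sorted(paths_by_class[label])
        rng.shuffle(paths)
        n_val = int(round(len(paths) * val_fraction))
        val.extend((p, label) for p in paths[:n_val])
        train.extend((p, label) for p in paths[n_val:])
    return train, val


# Hypothetical file names, used only to show the call.
example = {
    "Rust": [f"rust_{i}.jpg" for i in range(10)],
    "Phoma": [f"phoma_{i}.jpg" for i in range(5)],
}
train_samples, val_samples = split_train_val(example)
```

Splitting within each class keeps the class proportions similar in both subsets. Because the dataset includes augmented images, augmented variants of one original photo may end up on both sides of a random split, which can inflate validation accuracy; grouping variants by their source image before splitting would avoid this if that information is available.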


Steps to reproduce

Images were captured at an Arabica coffee plantation using a camera, with the help of a plant pathologist. The images were then cropped to focus on the region of interest. Image augmentation was applied to increase the dataset size and to prevent over-fitting during model training and validation. The images are organized in separate folders containing annotated images of the healthy, Phoma, Rust, Cercospora and Miner classes.
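As a quick check after downloading, the per-folder organization can be inspected with a short script. This is a minimal sketch that assumes one sub-folder per class under a single root directory; the exact folder names, nesting, and image file extensions in the archive are assumptions here and may need adjusting.

```python
from pathlib import Path
from typing import Dict

# Assumed image extensions; adjust to match the downloaded files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_by_class(root: str) -> Dict[str, int]:
    """Count image files in each immediate sub-folder of `root`.

    Each sub-folder is treated as one class (for example a folder of
    Rust images), and nested folders are searched recursively.
    """
    counts: Dict[str, int] = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the top level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Example call, with a hypothetical extraction path:
# print(count_images_by_class("JMuBEN"))
```

Comparing the sum of these counts with the total stated in the description is a simple way to confirm that the download and extraction were complete.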


Chuka University, University of Embu, Jomo Kenyatta University of Agriculture and Technology


Machine Learning, Agricultural Development, Deep Learning