Lightweight Dataset for Maize Classification on Resource-Constrained Devices

Published: 3 September 2023 | Version 2 | DOI: 10.17632/r6vvm5jkh6.2
Emmanuel Asante,


The dataset comprises images of three maize seed varieties, Wang Dataa, Sanzal Sima, and Bihilifa, sourced from Heritage Seeds Ghana. These varieties are commonly cultivated in the northern region of Ghana. At the collection point, Heritage Seeds Ghana manually sorted and labeled the images as either 'good' or 'bad' for each of the three varieties. The 'good' category represents high-quality maize seeds suitable for productive yields, while the 'bad' category covers damaged, infected, or low-quality seeds not suitable for production. The images were captured with a 12-megapixel phone camera, producing original JPEG images of varying dimensions; the augmented images were standardized to 128 by 128 pixels. Images were captured in daylight against a blue background for consistency and clarity, with no specific control of lighting conditions. The images were then organized into their respective 'good' and 'bad' classes. Overall, the dataset comprises both raw (4,846) and augmented (28,910) color images.
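Because the images are organized into 'good' and 'bad' classes, a quick file count per class is a useful first check after downloading. The sketch below is only illustrative: the folder layout (`<root>/<variety>/<good|bad>/...`), the accepted file extensions, and the function name `count_images_by_label` are assumptions, not something the dataset page specifies.

```python
from collections import Counter
from pathlib import Path

# Assumed (hypothetical) layout: <root>/<variety>/<good|bad>/<image files>
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: extensions to count
LABELS = {"good", "bad"}


def count_images_by_label(root):
    """Count image files under `root`, keyed by the nearest 'good'/'bad' folder."""
    root = Path(root)
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Only look at folders below `root`, nearest folder first.
        folders = path.relative_to(root).parts[:-1]
        label = next(
            (f.lower() for f in reversed(folders) if f.lower() in LABELS), None
        )
        if label is not None:
            counts[label] += 1
    return counts
```

Calling `count_images_by_label("path/to/raw")` on the raw images should give per-class counts that can be compared with the figures published for the dataset, provided the downloaded archive follows the assumed layout.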


Steps to reproduce

Hyperspectral imaging combined with deep learning has been used to classify maize. However, these automated methods often require substantial processing and computing resources, which makes deployment on embedded devices difficult because of high GPU power consumption. Local Ghanaian maize data for such classification tasks is also extremely difficult to obtain. To address these challenges, this research creates a simple dataset of three distinct types of local maize seeds in Ghana. The goal is to facilitate the development of an efficient maize classification tool that minimizes computational cost and reduces human involvement in grading seeds for marketing and production. The dataset is presented in two parts. The raw data consists of 4,846 images, of which 2,211 belong to the bad class and 2,635 to the good class. The augmented data consists of 28,910 images, of which 13,250 belong to the bad class and 15,660 to the good class. All images have been validated by experts from Heritage Seeds Ghana and are freely available for use within the research community.
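The class counts quoted above can be checked with simple arithmetic. The following sketch uses only the figures stated in this description; the helper name `summarize` is illustrative. It confirms that the per-class counts add up to the stated totals and reports the class shares and the augmented-to-raw ratio.

```python
# Counts as stated in the dataset description.
RAW = {"bad": 2211, "good": 2635}
AUGMENTED = {"bad": 13250, "good": 15660}


def summarize(counts):
    """Return the total number of images and each class's share of that total."""
    total = sum(counts.values())
    shares = {label: n / total for label, n in counts.items()}
    return total, shares


raw_total, raw_shares = summarize(RAW)
aug_total, aug_shares = summarize(AUGMENTED)

# The per-class counts should add up to the published totals.
assert raw_total == 4846
assert aug_total == 28910

for name, total, shares in [
    ("raw", raw_total, raw_shares),
    ("augmented", aug_total, aug_shares),
]:
    parts = ", ".join(f"{label} {share:.1%}" for label, share in shares.items())
    print(f"{name}: {total} images ({parts})")

# How many augmented images exist per raw image, overall and per class.
print(f"overall augmented/raw ratio: {aug_total / raw_total:.2f}")
for label in RAW:
    print(f"{label} augmented/raw ratio: {AUGMENTED[label] / RAW[label]:.2f}")
```

Both parts are mildly imbalanced toward the good class, which may be worth accounting for when choosing an evaluation metric.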


University of Energy and Natural Resources


Machine Learning, Machine Learning Algorithm, Image Classification, Seed, Maize, Precision Agriculture, Convolutional Neural Network, Deep Learning, Neural Network