Multispectral images of apples for ripeness, sweetness and variety grading [Data set]
Description
This dataset comes from a research effort to develop and apply a cost-effective, custom-built multispectral imaging chamber for evaluating apple quality. The imaging setup captures spectral reflectance across several wavelength bands, enabling non-invasive analysis of fruit characteristics such as ripeness, sweetness, and varietal differences. Eight wavelengths were used in this study. All images were collected under uniform, controlled lighting to minimize environmental variability and improve data consistency.
Grading by Ripeness: three classes were considered: Under-Ripe, Ripe, and Over-Ripe. Five varieties were used, with four apples of each:
1. Red Delicious (USA)
2. Royal Gala
3. Red Delicious (New Zealand)
4. Washington
5. Kinnaur
Grading by Variety: three classes were considered: Red Delicious (USA), Alpita, and Royal Gala. Seven apples of each variety were used.
Grading by Sweetness: four classes were defined by sugar content: 10, 12, 13, and 15 % Brix. The same five varieties listed above for ripeness grading were used, with four apples of each.
The images were concatenated using MATLAB code to create the concatenated datasets for grading by sweetness, ripeness, and variety. APPLENET, a CNN-based architecture, was used to process the concatenated images and achieved accuracies of 87 %, 65 %, and 92 % for grading by ripeness, sweetness, and variety, respectively. The dataset is labelled and structured to support applications in agricultural technology and food quality monitoring.
The dataset can be used to develop classification and regression models for fruit grading, maturity evaluation, and early detection of surface-level defects. Researchers working on agricultural automation, deep learning for food quality inspection, or horticulture may find it particularly valuable. The image acquisition and annotation methodology can also be adapted to other fruits, which extends its use to broader agricultural research. The documentation and consistent imaging protocol support reproducibility, making the dataset a useful benchmark for comparative studies. It provides an annotated source of multispectral fruit images for non-destructive quality evaluation in computer vision and AI-based agriculture.
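The concatenation step mentioned in the description was done in MATLAB, and its exact layout is not documented here. As a rough illustration only, the sketch below uses Python with NumPy (a swapped-in tool, not the authors' code) to show two plausible layouts: stacking the eight single-band images along a channel axis, or tiling them side by side. The function names, image size, and band order are assumptions, and the input is synthetic.

```python
import numpy as np

N_BANDS = 8  # the description states eight wavelengths; the band order is an assumption


def concatenate_bands(bands):
    """Stack single-band images (each H x W) into one H x W x N_BANDS array."""
    if len(bands) != N_BANDS:
        raise ValueError(f"expected {N_BANDS} band images, got {len(bands)}")
    shapes = {b.shape for b in bands}
    if len(shapes) != 1:
        raise ValueError(f"band images differ in shape: {sorted(shapes)}")
    return np.stack(bands, axis=-1)


def tile_bands(bands):
    """Alternative layout: place the bands side by side as one wide 2-D image."""
    return np.hstack(bands)


# Synthetic stand-ins for eight grayscale captures of one apple (not real data).
rng = np.random.default_rng(0)
demo_bands = [
    rng.integers(0, 256, size=(48, 64), dtype=np.uint8) for _ in range(N_BANDS)
]

cube = concatenate_bands(demo_bands)   # channel-stacked layout
mosaic = tile_bands(demo_bands)        # side-by-side layout
print(cube.shape, mosaic.shape)
```

Which layout matches the published concatenated images should be checked against the files themselves; a CNN would typically need its input layer sized to whichever layout is used.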
Files
Steps to reproduce
Data Acquisition Methodology
To develop a reliable dataset for grading apples by sweetness, ripeness, and variety, a systematic and reproducible data collection process was implemented using a custom-built multispectral imaging chamber and standardized protocols. The chamber was designed as a low-cost, flexible setup for capturing high-resolution spectral images under controlled lighting.
Instrumentation and Setup
The imaging system was built around a Logitech HD Portable 1080p Webcam (C615) with autofocus. Spectral illumination was provided by an RGB LED Soft Ring Light (MJ26) supporting eight wavelengths. Both were mounted inside a wooden enclosure (30×45×45 cm) with a front door to block ambient light and reduce reflections. A standard computer controlled the setup and processed the images.
Sample Collection and Preparation
Apple samples were collected from local farms and markets across Maharashtra, India. The dataset includes the cultivars Red Delicious (USA), Red Delicious (New Zealand), Royal Gala, Washington, Kinnaur, and Alpita. Each apple was gently cleaned to remove debris or wax and left at room temperature before imaging.
Image Capture Protocol
Each apple was imaged under narrowband light from three angles: top, side, and bottom. At least nine images per sample, including a black reference, were captured using different wavelength settings. All images were saved in JPEG format for consistency.
Sweetness Measurement (Brix Value)
To establish ground truth for sweetness, a handheld digital refractometer (ERMA INC) was used to measure the % Brix of juice extracted from each apple. Juice was obtained by cutting the apple and manually squeezing the inner pulp. Brix values were recorded and linked to the corresponding image set for supervised learning.
Ripeness Grading
Ripeness levels were determined using visual inspection (color, firmness, texture), Brix readings, and expert input from local horticulturists.
Apples were classified as unripe, ripe, or overripe, and the labels were cross-checked among observers to reduce subjectivity.
Data Labeling and Organization
Each image file was named using a structured format that included apple variety, ripeness level, Brix value, and wavelength.
Software and Workflow
MATLAB was used to concatenate the images. Image preprocessing (such as resizing, background removal, and contrast adjustment) was done using Python and OpenCV. The data were organized into labeled folders for machine learning. Scripts used for capture and processing are available upon request.
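The file names are said to encode variety, ripeness level, Brix value, and wavelength, but the actual syntax is not given. The sketch below assumes a hypothetical pattern, `<variety>_<ripeness>_brix<value>_band<index>.jpg`, purely to illustrate how such names could be parsed into labels and mapped to per-task folders. The pattern, the `ImageLabel` class, and both helper functions are illustrative assumptions and would need adjusting to the real files.

```python
import re
from dataclasses import dataclass
from pathlib import PurePosixPath

# Hypothetical naming pattern: <variety>_<ripeness>_brix<value>_band<index>.jpg
# The source lists these four fields but not their syntax, so this regex is an assumption.
NAME_RE = re.compile(
    r"^(?P<variety>[A-Za-z]+)_(?P<ripeness>unripe|ripe|overripe)"
    r"_brix(?P<brix>\d+(?:\.\d+)?)_band(?P<band>\d+)\.jpe?g$",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class ImageLabel:
    variety: str
    ripeness: str
    brix: float
    band: int


def parse_name(filename):
    """Parse one file name into its four label fields."""
    m = NAME_RE.match(filename)
    if m is None:
        raise ValueError(f"file name does not match the assumed pattern: {filename!r}")
    return ImageLabel(
        variety=m["variety"],
        ripeness=m["ripeness"].lower(),
        brix=float(m["brix"]),
        band=int(m["band"]),
    )


def label_folder(label, task):
    """Return the relative class folder for an image under one grading task."""
    if task == "ripeness":
        leaf = label.ripeness
    elif task == "variety":
        leaf = label.variety
    elif task == "sweetness":
        leaf = f"brix_{label.brix:g}"
    else:
        raise ValueError(f"unknown grading task: {task!r}")
    return PurePosixPath(task) / leaf


# Example with a made-up file name (not taken from the dataset).
label = parse_name("RoyalGala_overripe_brix15_band7.jpg")
print(label)
print(label_folder(label, "sweetness"))
```

Keeping the labels in the file name means the same image can be placed in a ripeness, variety, or sweetness folder tree without a separate annotation file, which is one reason a structured naming scheme suits a three-task dataset like this.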