Mango DMC and NIR spectra

Published: 15 February 2026 | Version 6 | DOI: 10.17632/46htwnp833.6

Description

These datasets contain near-infrared (NIR) absorbance spectra of mango mesocarp over the wavelength range 309-1149 nm, with corresponding dry matter content (DMC) values. The intent of publishing this dataset is to provide a consistent benchmark, so that researchers can compare modelling techniques against published results using the same training and test sets.

In its original form, the dataset was presented as a real-world case study: several years of data were used to develop a model, and performance was then evaluated on data from the final year, explicitly accepting differences in population variance.

Researchers should not combine all data and perform a random split into training and test sets. A random split would allow replicate spectra of the same fruit to appear in both sets, leading to overly optimistic performance estimates. As described in the associated publications: “Each fruit was scanned TWICE on the widest section of each cheek (approximately the middle of the fruit), orthogonal to the endocarp plane. This scan location represents fruit tissue that approximates the average mesocarp composition (data not shown). A subset of 744 samples was scanned at THREE fruit temperatures (approximately 15, 25, and 35 °C).”

If researchers choose to define their own training and test splits, this must be clearly stated, and the resulting outcomes should not be compared directly with those reported in the original publications.
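
The following is a minimal sketch of two leakage-free ways to split the data: holding out the final season, as in the original case study, and splitting by fruit so that replicate spectra stay together. It is not the authors' code. The field names `fruit_id`, `season` and `scan` and the synthetic records are assumptions for illustration only; the actual column names in the CSV file should be checked before adapting it.

```python
import random
from collections import defaultdict


def holdout_final_season(records, season_key="season"):
    """Train on every season except the last one; test on the last season."""
    last = max(r[season_key] for r in records)
    train = [r for r in records if r[season_key] != last]
    test = [r for r in records if r[season_key] == last]
    return train, test


def grouped_split(records, group_key="fruit_id", test_fraction=0.25, seed=0):
    """Assign whole groups (e.g. fruit) to train or test.

    All spectra sharing a group key end up in the same set, so replicate
    scans of one fruit can never appear on both sides of the split.
    """
    groups = defaultdict(list)
    for r in records:
        groups[r[group_key]].append(r)

    ids = sorted(groups)                 # deterministic order before shuffling
    random.Random(seed).shuffle(ids)     # reproducible shuffle of group ids
    n_test = max(1, round(len(ids) * test_fraction))

    test = [r for gid in ids[:n_test] for r in groups[gid]]
    train = [r for gid in ids[n_test:] for r in groups[gid]]
    return train, test


# Synthetic stand-in data (hypothetical fields): 20 fruit, 2 scans each.
records = [
    {"fruit_id": f, "season": 2015 + f % 4, "scan": s}
    for f in range(20)
    for s in range(2)
]

train, test = grouped_split(records)
season_train, season_test = holdout_final_season(records)
```

Splitting on the group key rather than on individual spectra is the design choice that matters here. Any result obtained with a custom split of this kind should still be reported as such and not compared directly with the published figures.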
The file "MangoDMC_NIR_Data_v3.csv" contains the data used in the publication "Achieving robustness across season, location and cultivar for a NIRS model for intact mango fruit dry matter content" (Postharvest Biology and Technology, 2020, 168:111202; https://www.sciencedirect.com/science/article/pii/S0925521420301629), with the addition of data from a further harvest season, as used in the publication "Evaluation of 1D Convolutional Neural Network in Estimation of Mango Dry Matter Content" (Spectrochimica Acta Part A, 2024, 311:124003; https://www.sciencedirect.com/science/article/pii/S1386142524001690). The current version (v4) has an additional file ".csv". This file augments the data of version 3 with data from additional instruments and seasons, as used in the submitted thesis of Jeremy Walsh, 2024, Central Queensland University, "Deep Learning in Estimation of Fruit Attributes Using Near Infrared Spectroscopy".

Steps to reproduce

Follow the steps listed in the materials and methods in the companion paper.

Categories

Near Infrared Spectroscopy, Absorbance, Mango
