Data for: Raman and Infrared Spectroscopic Analysis: Classification and Peroxide Value Prediction of Naturally Aged Edible Oils

Published: 28 May 2024 | Version 2 | DOI: 10.17632/ctgg7k4m5g.2
William Gilbraith, Chance Carter, Kristl Adams, Isio Sota-Uba, Barry Lavine, Josh Ottaway


This dataset describes edible oil samples collected from grocery stores in and around Newark, Delaware, between the summer of 2014 and the spring of 2016. A total of 100 oil bottles were obtained, and three data sets were created from them.

Data Set 1 contains measurements from all 100 samples, obtained using NIR, MIR, and Raman spectroscopy. The peroxide values (PVs) of these samples were determined by titration at Lawrence Livermore National Laboratory. Data Set 1 is divided into two subgroups: Data Set 1A, measured in 2016, and Data Set 1B, measured in 2019.

Data Set 2 is a subset of Data Set 1 consisting of 53 oil samples. These samples were measured using Raman spectroscopy and titrated to determine the PV at the University of Delaware.

Data Set 3 is another subset of Data Set 1. It comprises 356 IR spectra of 20 varieties of pure edible oils, as well as 120 spectra of extra virgin olive oil (EVOO) adulterated with corn oil, canola oil, or almond oil. These samples were measured using ATR-FTIR spectroscopy at 4 cm^-1 resolution at Oklahoma State University.

The measurement techniques and parameters varied between data sets. NIR spectra were acquired using FTIR spectrometers with different optical path lengths, MIR spectra were collected using a liquid nitrogen-cooled mercury cadmium telluride (MCT) detector, and Raman spectra were obtained with different Raman probes and lasers. The spectroscopic measurements were complemented with titration measurements to determine the PVs.

The dataset is provided as individual csv files, one for each type of spectroscopy. In each file, the top row gives the wavelength range of the spectra, and the first two columns give the class and the corresponding peroxide value for each spectrum.
Note: In a few instances, replicates were not taken or certain samples were replaced with NaN values to maintain the proper matrix dimensions.
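The file layout described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not the authors' own loading code: the function names (parse_float, load_spectra, drop_nan_rows) are hypothetical, the header labels and numbers in the demo are made up for illustration, and it assumes that the first two header cells are labels while the remaining header cells hold the spectral axis.

```python
import csv
import io
import math


def parse_float(cell):
    """Convert a CSV cell to float; blank or unparseable cells become NaN."""
    try:
        return float(cell)
    except ValueError:
        return math.nan


def load_spectra(handle):
    """Read one spectroscopy CSV in the layout described above.

    Assumed layout: the header row holds two label cells followed by the
    spectral axis; each later row holds class, peroxide value, then one
    intensity per axis point.
    """
    reader = csv.reader(handle)
    header = next(reader)
    axis = [parse_float(c) for c in header[2:]]
    classes, pvs, spectra = [], [], []
    for row in reader:
        if not row:  # skip completely empty lines
            continue
        classes.append(row[0])
        pvs.append(parse_float(row[1]))
        spectra.append([parse_float(c) for c in row[2:]])
    return axis, classes, pvs, spectra


def drop_nan_rows(classes, pvs, spectra):
    """Keep only rows whose PV and every intensity are real numbers."""
    kept = [
        (c, p, s)
        for c, p, s in zip(classes, pvs, spectra)
        if not math.isnan(p) and not any(math.isnan(v) for v in s)
    ]
    return [list(col) for col in zip(*kept)] if kept else [[], [], []]


# Tiny synthetic example (made-up numbers, not values from the dataset).
demo = io.StringIO(
    "class,PV,1000,1004,1008\n"
    "olive,5.2,0.10,0.12,0.11\n"
    "corn,NaN,NaN,NaN,NaN\n"
)
axis, classes, pvs, spectra = load_spectra(demo)
clean_classes, clean_pvs, clean_spectra = drop_nan_rows(classes, pvs, spectra)
```

To read a downloaded file instead of the demo string, pass an open file handle, for example open(path, newline=""), to load_spectra. Dropping NaN rows is only one option; whether to drop or keep the placeholder rows depends on whether the analysis relies on the fixed matrix dimensions.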



Analytical Chemistry, Spectroscopy, Chemometrics


National Science Foundation


Lawrence Livermore National Laboratory

