Spectral data of tropical soils using dry-chemistry techniques (VNIR, XRF, and LIBS): a dataset for soil fertility prediction

Published: 22 February 2022 | Version 2 | DOI: 10.17632/88c5kvmgbf.2
Tiago Rodrigues Tavares,


The dataset contains spectral data and characterizations of key soil fertility attributes of 102 soil samples. The samples come from two Brazilian agricultural areas whose soils are classified as Lixisol (Field 1) and Ferralsol (Field 2). Both soil types are commonly found in Brazil’s tropical regions. The chosen fields have different soil matrices owing to their considerable contrast in texture and total elemental composition, and their fertility attributes vary over wide ranges. After soil fertility tests, the samples were scanned with the following direct analysis techniques: (i) visible and near infrared diffuse reflectance spectroscopy (VNIR), (ii) X-ray fluorescence spectroscopy (XRF), and (iii) laser-induced breakdown spectroscopy (LIBS).


Steps to reproduce

Soil samples were collected from 0 to 20 cm depth. Soil fertility analyses for determining clay, organic matter (OM), cation exchange capacity (CEC), pH, base saturation (V), exchangeable (ex-) P, ex-K, ex-Ca, and ex-Mg were performed in a commercial laboratory. Loose soil samples (dry, grain size < 2 mm) were scanned with the VNIR and XRF sensors. For LIBS data acquisition, the samples were comminuted in a ball mill with a binder material and then pelletized. Spectral data acquisition was performed under laboratory conditions after the samples had been dried and sieved (≤ 2 mm). The VNIR data were acquired after the spectrometer had calibrated itself using reference materials with known spectral behaviour. For XRF data acquisition, the X-ray tube was set to a voltage of 35 kV and a current of 7 μA; no vacuum condition or filters were used. For LIBS data acquisition, the following instrumental conditions were used: laser pulses of 65 mJ, a lens-to-sample distance of 19.5 cm (giving a laser fluence of 255 J cm−2), 15 accumulated laser pulses, a delay time of 2 µs, and an integration time gate of 7 µs.

The shared dataset contains four tables (each shared in both .txt and .xlsx format) named "soil fertility data", "VNIR data", "XRF data", and "LIBS data", which contain the data from the soil fertility analysis and from the VNIR, XRF, and LIBS spectroscopies, respectively. The tables "soil fertility data", "VNIR data", and "XRF data" are organized as dataframes in long format (i.e., observations in rows and variables in columns), whereas the table containing the LIBS data is a dataframe in wide format (i.e., observations in columns and variables in rows). All datasets have 102 observations and use the variable ID (the first variable in every dataset) as primary key; it identifies the samples (observations) with sequential numbers.
The second variable of all datasets is named "Field" and contains the category “1” for samples from Field 1 (n = 58) and the category “2” for samples from Field 2 (n = 44). The other variables of each dataset are specified below.
• "Soil fertility data": columns 3 to 11 (9 variables) hold clay, OM, CEC, pH, V, ex-P, ex-K, ex-Ca, and ex-Mg, respectively. Values are given in g dm−3 for clay and OM; in mmolc dm−3 for CEC, ex-K, ex-Ca, and ex-Mg; in % for V; and in mg dm−3 for ex-P.
• "VNIR data": columns 3 to 353 (351 variables) hold the reflectance values (expressed in %) at wavelengths ranging from 431.59 to 2153.11 nm.
• "XRF data": columns 3 to 2050 (2048 variables) hold the emission intensity values (in photon counts per second) at energies ranging from 0.01 to 40.74 keV.
• "LIBS data": rows 3 to 53719 (53717 variables) hold the emission intensity values (in arbitrary units) at wavelengths ranging from 200.01 to 779.99 nm.
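
Because the LIBS table stores samples in columns while the other three tables store them in rows, the LIBS table has to be transposed before the tables can be joined on ID. The following is a minimal sketch of that step, not the authors' procedure. It uses only the Python standard library and tiny synthetic stand-ins instead of the real files. The tab delimiter, the presence of a header row (or, for LIBS, a leading label column) and the use of wavelengths as variable names are assumptions that should be checked against the actual .txt files; the helper names are hypothetical.

```python
import csv
import io


def read_table(text, delimiter="\t"):
    """Parse delimited text into a list of rows (lists of strings).

    The tab delimiter is an assumption about the .txt files.
    """
    return [row for row in csv.reader(io.StringIO(text), delimiter=delimiter) if row]


def transpose(rows):
    """Swap rows and columns so that samples end up in rows."""
    return [list(column) for column in zip(*rows)]


def index_by_id(rows):
    """Map each sample ID (first column) to a {variable: value} dict.

    Assumes the first row is a header whose first entry is "ID".
    """
    header = rows[0]
    return {row[0]: dict(zip(header, row)) for row in rows[1:]}


def join_on_id(left, right, shared=("ID", "Field")):
    """Inner-join two ID-indexed tables, keeping the shared keys only once."""
    joined = {}
    for sample_id in left.keys() & right.keys():
        record = dict(left[sample_id])
        record.update(
            {name: value for name, value in right[sample_id].items() if name not in shared}
        )
        joined[sample_id] = record
    return joined


# Synthetic stand-ins (NOT real measurements) that only mimic the two layouts.
# Samples in rows, variables in columns (fertility, VNIR and XRF tables):
fertility_txt = "ID\tField\tclay\tOM\n1\t1\t180\t21\n2\t2\t450\t30\n"
# Samples in columns, variables in rows (LIBS table):
libs_txt = "ID\t1\t2\nField\t1\t2\n200.01\t10.5\t11.0\n200.02\t9.8\t12.1\n"

fertility = index_by_id(read_table(fertility_txt))
libs = index_by_id(transpose(read_table(libs_txt)))  # transpose first
merged = join_on_id(fertility, libs)
```

With the real tables, the same pattern would apply after reading each file's contents into a string; values are kept as strings here, so they would still need converting to numbers before any modelling.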


Universidade de Sao Paulo


Soil Science, Spectroscopy, Nutrient Sensing, Soil Fertility