Data, R Scripts and Random Forest Models for Winter Catch Crop Monitoring from Sentinel-2 NDVI Time Series in Germany

Published: 27 November 2023 | Version 3 | DOI: 10.17632/78g2r5dp3k.3


The dataset consists of a zip file with the following folders: a) data (agricultural parcels, filled and unfilled NDVI time series tables, feature extraction tables and prediction results) (csv, shp), b) model (Random Forest models for catch crop prediction) (rds), and c) R (R script files for Random Forest model training and prediction with RStudio) (r).

The algorithms and models developed for this study were implemented via virtual Docker containers in the timeStamp software prototype, which allows for large-scale automated catch crop analysis at the parcel level. timeStamp saves Sentinel-2 raster data as parcel-wise clipped image time series in a PostGIS database. All further processing steps were performed with the statistical computing language R (RStudio Team, 2020). For raster data manipulation within the PostGIS database and for downloading NDVI time series, we used the packages rpostgis (Bucklin and Basille, 2019) and RPostgreSQL (Conway et al., 2017). For time series filling and predictor calculation, we used the packages zoo (Zeileis et al., 2020), hydroGOF (Zambrano-Bigiarini, 2020), tsoutliers (de Lacalle, 2019), and changepoint (Killick et al., 2016). For Random Forest modelling, we used the package caret (Kuhn et al., 2020).

The original data for NDVI time series calculation are from the GFZ Time Series System for Sentinel-2 by the German Research Centre for Geosciences, 2020. The predictors for Random Forest modelling calculated from the NDVI time series are described in the article cited below. For further information, we refer to the following article: Schulz, C.; Holtgrave, A.; Kleinschmit, B.: Large-scale winter catch crop monitoring with Sentinel-2 time series and machine learning–An alternative to on-site controls?, Computers and Electronics in Agriculture, Volume 186, 2021, 106173, ISSN 0168-1699.
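To illustrate what filling an NDVI time series involves, the following minimal sketch interpolates cloud-related gaps linearly for a single parcel. It is an assumption-laden illustration, not the authors' procedure: the dates and NDVI values are invented, and base R's approx() stands in for the zoo-based filling used in the study, whose exact method is described in the article.

```r
# Hypothetical NDVI observations for one parcel (values invented for illustration).
# NA marks acquisition dates without a usable observation, e.g. due to clouds.
dates <- as.Date("2018-09-01") + seq(0, 50, by = 10)
ndvi  <- c(0.20, NA, 0.45, NA, NA, 0.70)

# Linear interpolation between the nearest valid observations.
# rule = 2 would carry the first/last valid value outward if gaps sat at the ends.
observed <- !is.na(ndvi)
filled <- approx(
  x    = as.numeric(dates[observed]),
  y    = ndvi[observed],
  xout = as.numeric(dates),
  rule = 2
)$y

# Side-by-side view of the unfilled and filled series.
ndvi_table <- data.frame(date = dates, ndvi_unfilled = ndvi, ndvi_filled = filled)
print(ndvi_table)
```

Observed values are left unchanged; only the NA entries are replaced, which mirrors the distinction between the unfilled and filled time series tables in the data folder.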


Steps to reproduce

1. Extract the zip folder.
2. Open the R scripts.

(Model training and prediction require RStudio and R 3.6.1 with the packages "tidyverse", "rlist", "lubridate", "caret", "xgboost", "ipred", "doSNOW", "snowfall", "randomForest", "e1071" and "data.table" installed. Please set the working directory at the beginning of each script to the path of the unzipped folder.)
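The setup described above can be sketched as follows. This is only a suggestion: the path in project_dir is a placeholder for wherever the folder was unzipped, and the variable names are not taken from the published scripts.

```r
# Packages listed in the steps to reproduce.
required_pkgs <- c(
  "tidyverse", "rlist", "lubridate", "caret", "xgboost", "ipred",
  "doSNOW", "snowfall", "randomForest", "e1071", "data.table"
)

# Determine which of them are not yet installed in the current R library.
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))

# Install missing packages only in an interactive session (e.g. RStudio),
# so that running the script in batch mode never triggers downloads.
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Placeholder path (assumption): replace with the location of the unzipped folder.
project_dir <- "path/to/unzipped/folder"
if (dir.exists(project_dir)) {
  setwd(project_dir)
} else {
  message("Please set project_dir to the unzipped folder before running the scripts.")
}
```

Note that the dataset was prepared with R 3.6.1; newer R or package versions may behave differently when reading the stored models.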


Technische Universität Berlin


Remote Sensing, Machine Learning, Environmental Modeling, Time Series, Compliance Monitoring, Cover Crop, Agricultural Diversification, Random Decision Forest