Time Series Dataset for Modeling and Forecasting of N2O in Wastewater Treatment

Published: 9 July 2024 | Version 2 | DOI: 10.17632/xmbxhscgpr.2
Laura Debel Hansen, Anju Rani, Daniel Ortiz Arroyo, Petar Durdevic


This dataset presents two years of high-resolution nitrous oxide (N2O) measurements for time series modeling and forecasting in wastewater treatment plants (WWTPs). It comprises real-time measurements from a full-scale WWTP at a sampling interval of 2 minutes, which makes it well suited to developing models for real-time operation and control. The biochemical dataset includes detailed influent and effluent parameters, operational conditions, and environmental factors. Unlike existing datasets, it addresses the specific challenges of modeling N2O, a potent greenhouse gas, and gives researchers a resource for improving predictive accuracy and control strategies in wastewater treatment processes. It also serves machine learning and deep learning time series forecasting as a benchmark that mirrors the complexities of real-world processes. We provide a detailed description of the dataset along with a statistical analysis that highlights its characteristics, such as nonstationarity, nonnormality, seasonality, heteroscedasticity, structural breaks, asymmetric distributions, and intermittency, which are common in real-world time series and pose challenges for forecasting models.
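As a rough illustration of how some of the characteristics listed above might be screened for, the following Python sketch computes a few simple diagnostics on a series sampled every 2 minutes. It does not read the published files: the file layout and column names are not given here, so the code builds a synthetic stand-in series instead. The function names `make_synthetic_series` and `describe_series`, the series name `n2o`, and all parameter values are hypothetical and chosen only for illustration. These diagnostics are crude proxies, not the statistical analysis performed by the authors.

```python
import numpy as np
import pandas as pd

SAMPLES_PER_DAY = 24 * 60 // 2  # 720 samples per day at a 2-minute interval


def make_synthetic_series(n_days=14, seed=0):
    """Hypothetical stand-in for a signal sampled every 2 minutes.

    This is NOT the published data; it only mimics a daily cycle,
    right-skewed noise, and long stretches of zero values.
    """
    rng = np.random.default_rng(seed)
    n = n_days * SAMPLES_PER_DAY
    index = pd.date_range("2024-01-01", periods=n, freq="2min")
    t = np.arange(n)
    daily_cycle = 1.0 + np.sin(2 * np.pi * t / SAMPLES_PER_DAY)
    active = rng.random(n) < 0.3                      # active about 30% of the time
    noise = rng.gamma(shape=2.0, scale=0.5, size=n)   # right-skewed, positive
    return pd.Series(daily_cycle * noise * active, index=index, name="n2o")


def describe_series(series, zero_tol=1e-9):
    """Return simple proxies for several time series characteristics."""
    daily_mean = series.resample("1D").mean()
    daily_std = series.resample("1D").std()
    return {
        # Asymmetry of the distribution (positive means a long right tail).
        "skewness": float(series.skew()),
        # Share of (near-)zero samples, a crude intermittency measure.
        "zero_fraction": float((series.abs() <= zero_tol).mean()),
        # Autocorrelation at a one-day lag, a crude seasonality measure.
        "daily_autocorr": float(series.autocorr(lag=SAMPLES_PER_DAY)),
        # Spread of the daily means and daily standard deviations; large
        # values hint at a drifting level or changing variance.
        "daily_mean_range": float(daily_mean.max() - daily_mean.min()),
        "daily_std_range": float(daily_std.max() - daily_std.min()),
    }


if __name__ == "__main__":
    stats = describe_series(make_synthetic_series())
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

To apply the same checks to the real measurements, the synthetic series would be replaced by a `pandas.Series` with a datetime index built from the downloaded files. Formal tests for stationarity or structural breaks would require additional tooling beyond this sketch.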



Aalborg Universitet


Environmental Science, Environmental Engineering, Time Series, Wastewater, Nitrous Oxide, Deep Learning, Digital Twin, Data-Driven Learning