Synthetic Simulation Data of Wind Power Time Series in Germany
Description
This dataset contains synthetic wind power generation time series for 160 locations in Germany. The data is provided at an hourly resolution covering a two-year period. It was generated using an open-source, reproducible simulation framework described in the publication "Synthetic Data Simulation Framework for Wind Power Time Series".
To address different modeling needs, the dataset is structured into three main folders:
- era5_wind_hourly_age: Synthetic data based exclusively on state-of-the-art ERA5 numerical reanalysis data.
- dwd_era5_wind_hourly_age: Synthetic data based on local DWD weather station measurements. Wind speed extrapolation utilizes dynamic vertical wind shear calculated from ERA5 multi-level data between 10 and 100 m.
- icon_d2_nwp_time_series: Accompanying numerical weather prediction (NWP) data from the ICON-D2 model. This features a 48-hour forecast horizon from 6, 9, 12, and 15 UTC runs.
Key features of the simulation framework include:
- Hub Height Air Density: Air density is dynamically calculated using an extended barometric formula. This avoids the inaccurate assumption of a constant standard air density.
- Aging Effects: Power coefficients (Cp) are dynamically adjusted based on turbine age. This uses an annual degradation rate (ADR) of 0.63%. It applies a real-world age distribution from the German market data register (MaStR).
- Physics-Informed Simulation: The final output is calculated using the physical power equation. This applies the extrapolated meteorological values and the adjusted Cp.
The time series files contain both the meteorological variables at ground/hub height and the synthetic power generation for six common turbine configurations: Enercon E-70 E4 (57 m), E-82 E2 (138 m), E-115 (149 m), Vestas V90 (95 m), V112-3.45 (119 m), V80-1.8 (78 m). Metadata tables detailing turbine specifications and site parameters are also included.
The framework was validated against Renewables.ninja and operational data from 13 real wind parks across nine German states. The results indicate that the ERA5-driven approach and the integration of turbine aging significantly improve long-term accuracy, particularly for older fleets. The dataset is suited to benchmarking, the development of machine learning forecasting models, grid planning, and energy system analysis.
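
To illustrate why the aging adjustment matters most for older fleets, the following minimal Python sketch applies the stated ADR of 0.63% to a turbine's power coefficient. The exact form of the degradation factor is not given in this description, so compound (geometric) degradation is an assumption here, and the function names `degradation_factor` and `aged_cp` are hypothetical rather than taken from the framework.

```python
ADR = 0.0063  # annual degradation rate of 0.63 %, as stated in the description


def degradation_factor(age_years: float, adr: float = ADR) -> float:
    """Fraction of the original power coefficient left after `age_years`.

    ASSUMPTION: compound degradation, (1 - adr) ** age. The published
    framework may use a linear or otherwise different formulation.
    """
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    return (1.0 - adr) ** age_years


def aged_cp(cp_new: float, age_years: float, adr: float = ADR) -> float:
    """Scale a manufacturer Cp value by the assumed degradation factor."""
    return cp_new * degradation_factor(age_years, adr)


if __name__ == "__main__":
    # Illustrative ages only; the dataset samples ages from the MaStR distribution.
    for age in (0, 5, 10, 20):
        print(f"age {age:2d} y: DF = {degradation_factor(age):.4f}")
```

Under this assumed formulation, a 20-year-old turbine retains roughly 88% of its original Cp, which suggests why neglecting aging can bias long-term estimates for older fleets.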
Steps to reproduce
The synthetic data was generated using a custom, open-source simulation framework, which is publicly available on GitHub and detailed in Section 3 of the accompanying paper. The workflow integrates two distinct simulation pathways (reanalysis-driven and measurement-driven) using several data sources and physical models.
Data Acquisition (Inputs):
- Meteorological Data: 10-minute observations (wind speed, wind direction, standard deviation of horizontal wind, pressure, temperature, relative humidity) for 160 DWD (German Weather Service) stations were sourced from the DWD Climate Data Center and aggregated to hourly resolution. Additionally, ERA5 reanalysis data (including 10 m and 100 m wind speeds, friction velocity, dew point, and surface pressure) were acquired.
- Forecast Data: Accompanying numerical weather prediction (NWP) data from the DWD's ICON-D2 model was extracted to provide a 48-hour forecast horizon for all locations.
- Technical Parameters: Turbine specifications (hub height, rotor diameter, power curves) were obtained from the 'wind-turbine-models.com' database.
- Age Distribution: An empirical age distribution was derived from the German Core Energy Market Data Register (MaStR) to realistically sample turbine commissioning dates.
Simulation Workflow (Protocol):
The framework executes the following physics-informed protocol for each timestamp at each location:
- Air Density Calculation: Air density at 2 m is first calculated from measured pressure, temperature, and relative humidity (or derived via dew point for ERA5). This value is then extrapolated to the turbine's specific hub height using the extended barometric formula.
- Wind Speed Extrapolation: For the reanalysis approach (era5_wind_hourly_age), the Hellmann exponent (shear) is calculated dynamically from the ERA5 wind speeds at 10 m and 100 m. For the measurement-based approach (dwd_era5_wind_hourly_age), the measured 10 m DWD wind speed is extrapolated using a turbulence-based estimation; the friction velocity required for this calculation is taken directly from ERA5 reanalysis data.
- Aging Degradation: A degradation factor (DF) is calculated based on an Annual Degradation Rate (ADR) of 0.63% and the turbine's sampled age. This DF is used to compute an "aged" power coefficient (Cp_aged) by dynamically adjusting the manufacturer's ideal Cp-curve.
- Power Synthesis: The final synthetic power is calculated using the physical formula P = 0.5 ⋅ ρ_hub ⋅ A ⋅ v_hub^3 ⋅ Cp_aged, integrating the extrapolated hub-height air density (ρ_hub), the rotor swept area (A), the extrapolated hub-height wind speed (v_hub), and the aged power coefficient (Cp_aged).
This workflow was repeated for all 160 locations over a two-year period, simulating six distinct turbine models at each location to generate the final dataset.
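
The reanalysis pathway of the protocol above can be sketched for a single timestamp as follows. This is a minimal illustration, not the framework's code: all function names are hypothetical, the hub-height air density and aged Cp are passed in as given values (the extended barometric formula and the Cp-curve lookup are not reproduced), and the numbers in the usage section are illustrative rather than taken from the dataset. The turbulence-based extrapolation of the measurement pathway is not shown.

```python
import math


def hellmann_exponent(v10: float, v100: float) -> float:
    """Shear exponent alpha from the power law v100 / v10 = (100 / 10) ** alpha."""
    if v10 <= 0 or v100 <= 0:
        raise ValueError("wind speeds must be positive")
    return math.log(v100 / v10) / math.log(100.0 / 10.0)


def extrapolate_wind(v_ref: float, z_ref: float, z_hub: float, alpha: float) -> float:
    """Power-law extrapolation of wind speed from height z_ref to z_hub (metres)."""
    return v_ref * (z_hub / z_ref) ** alpha


def synthetic_power(rho_hub: float, rotor_diameter: float,
                    v_hub: float, cp_aged: float) -> float:
    """P = 0.5 * rho_hub * A * v_hub^3 * Cp_aged, returned in watts."""
    area = math.pi * (rotor_diameter / 2.0) ** 2  # rotor swept area A in m^2
    return 0.5 * rho_hub * area * v_hub ** 3 * cp_aged


if __name__ == "__main__":
    # Illustrative inputs (assumptions, not dataset values).
    v10, v100 = 5.0, 8.0          # wind speeds at 10 m and 100 m in m/s
    hub_height = 138.0            # m, e.g. the E-82 E2 configuration listed above
    rotor_diameter = 82.0         # m, assumed for illustration
    rho_hub = 1.20                # kg/m^3, assumed hub-height air density
    cp_aged = 0.40                # assumed aged power coefficient at v_hub

    alpha = hellmann_exponent(v10, v100)
    v_hub = extrapolate_wind(v10, 10.0, hub_height, alpha)
    power_kw = synthetic_power(rho_hub, rotor_diameter, v_hub, cp_aged) / 1000.0
    print(f"alpha = {alpha:.3f}, v_hub = {v_hub:.2f} m/s, P = {power_kw:.0f} kW")
```

In practice, Cp_aged depends on wind speed, so a full implementation would look it up from the aged Cp-curve at v_hub for every timestamp; a constant value is used here only to keep the sketch short.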
Institutions
- Hochschule Karlsruhe Technik und Wirtschaft, Baden-Württemberg, Karlsruhe