Irradiance dataset in the south of Colombia from 2013 to 2023 in 5-minute intervals
Description
The dataset spans 11 years (2013 to 2023), with measurements taken at 5-minute intervals, resulting in approximately 603,495 irradiance records, each with a corresponding timestamp. Building the dataset required thorough preprocessing: removing erroneous (NaN) values and outliers, identifying missing data, detecting inconsistencies in date records, filling gaps with averaged data, visually inspecting the data through graphs, and exporting the final cleaned dataset. This preprocessing supports the integrity, accuracy, and consistency of the data, which are critical for reliable analysis and modeling.
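The preprocessing steps listed above could be sketched roughly as follows with Pandas. This is an illustrative sketch, not the authors' script: the column names `timestamp` and `irradiance`, the helper name `clean_irradiance`, and the upper bound of 1500 W/m² are hypothetical. The description does not say how the averaging was done, so filling gaps with the mean of the same time of day across other days is an assumption.

```python
import pandas as pd


def clean_irradiance(df: pd.DataFrame, max_irradiance: float = 1500.0) -> pd.DataFrame:
    """Return a regular 5-minute irradiance series (illustrative only).

    Assumes hypothetical columns 'timestamp' and 'irradiance'.
    """
    df = df.copy()
    df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
    # Drop rows whose timestamp could not be parsed, then duplicated timestamps.
    df = df.dropna(subset=["timestamp"]).drop_duplicates(subset="timestamp")
    series = df.set_index("timestamp")["irradiance"].sort_index()

    # Treat readings outside the assumed plausible range as missing.
    series = series.where((series >= 0) & (series <= max_irradiance))

    # Reindex onto a regular 5-minute grid so absent timestamps become NaN.
    full_index = pd.date_range(series.index.min(), series.index.max(), freq="5min")
    series = series.reindex(full_index)
    was_missing = series.isna()

    # Assumed averaging rule: fill each gap with the mean of the valid
    # readings taken at the same time of day on other days.
    slot_mean = series.groupby(series.index.time).transform("mean")
    filled = series.fillna(slot_mean)

    return pd.DataFrame({"irradiance": filled, "filled": was_missing})
```

Time slots that have no valid reading on any day stay NaN under this rule, and the `filled` column marks every value that was not an original valid measurement, so filled or still-missing records can be excluded from later analysis if needed.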
Files
Steps to reproduce
Data processing and quality control were performed using Microsoft Excel, Python with the Pandas library, and MATLAB scripts. The code for visualization and data preprocessing is available at: https://github.com/johnbarco/Irradiance_dataset_2013_2023. The Python libraries used are Pandas for data manipulation and handling, NumPy for numerical operations, Matplotlib for data visualization, and Seaborn for enhanced plots and color palettes. [visualizing_irradiance_3d.py] was used to generate Figure 1, and [correct_missing_data.py] was used to fill in missing data.
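As a rough illustration of how a 5-minute series can be prepared for a 3D view, the sketch below pivots a timestamp-indexed series into a grid with one row per day and one column per minute of the day. It does not reproduce the repository's scripts; the function name `to_day_time_grid` and the layout of the grid are assumptions made for this example.

```python
import pandas as pd


def to_day_time_grid(series: pd.Series) -> pd.DataFrame:
    """Pivot a timestamp-indexed series into a (date x minute-of-day) grid."""
    frame = pd.DataFrame(
        {
            "date": series.index.date,
            "minute_of_day": series.index.hour * 60 + series.index.minute,
            "irradiance": series.to_numpy(),
        }
    )
    # One row per calendar day, one column per 5-minute slot of the day.
    return frame.pivot_table(
        index="date", columns="minute_of_day", values="irradiance", aggfunc="mean"
    )
```

The resulting grid can be converted with `grid.to_numpy()` and, together with coordinate arrays from `numpy.meshgrid`, passed to Matplotlib's `plot_surface` on a 3D axes to draw irradiance against day and time of day.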