Data Enrichment for Machine Learning based Energy Prediction Model
Description
This dataset is used to train the component-based machine learning models described in the linked article. The article examines how enriching the training data with several building shapes affects the prediction accuracy of machine learning models. The training data were collected with EnergyPlus for nine building shapes. Please read the article for details of the component structure and the training of the machine learning models.
Steps to reproduce
1. Generate the specified number of samples (500 each for Shape 1 to Shape 6 and 1500 each for Custom Shape 1 to Custom Shape 3) using Sobol sequences.
2. Create the corresponding EnergyPlus file for each shape and storey option. Use C# programs>IDFWrite.exe with H Function; Shape 1 - one storey is generated with keyword 1a, Shape 2 - two storey with 2b, and CustomShape1 with cs1. This creates the EnergyPlus input file (.idf) in the corresponding folder.
3. Run the EnergyPlus simulations and keep the .csv files.
4. Read the EnergyPlus output and map it to the corresponding building elements, zone and building.
5. Create the data file for each component: walls, windows, ground floors, roofs, infiltration, zone and building. Use C# programs>IDFResultsRead.exe with H Function and the same keywords as in step 2 (1a for Shape 1 - one storey, 2b for Shape 2 - two storey, cs1 for CustomShape1). This creates one csv file for each component.
6. Develop the component-based machine learning models and use them to make energy predictions. The training process of the machine learning models and the component structure are described in the linked article. If the machine learning models are developed using one shape only, the prediction has a prefix "RawFeatures-E1" and so on.
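The sampling in step 1 can be sketched as follows. This is a minimal illustration in Python using SciPy's quasi-Monte Carlo module, not the code used to build the dataset. The sample counts per shape come from step 1; the parameter names and bounds in PARAM_BOUNDS and the helper sobol_samples are hypothetical placeholders, since the actual design parameters are defined in the linked article.

```python
import warnings

from scipy.stats import qmc

# Sample counts per shape, as stated in step 1.
SAMPLES_PER_SHAPE = {f"Shape {i}": 500 for i in range(1, 7)}
SAMPLES_PER_SHAPE.update({f"Custom Shape {i}": 1500 for i in range(1, 4)})

# Hypothetical parameters and bounds, for illustration only.
# The real design parameters are described in the linked article.
PARAM_BOUNDS = {
    "length_m": (10.0, 40.0),
    "width_m": (10.0, 40.0),
    "storey_height_m": (2.5, 4.0),
    "window_to_wall_ratio": (0.1, 0.9),
}


def sobol_samples(n_samples, bounds, seed=0):
    """Draw n_samples scrambled Sobol points scaled to the given bounds."""
    lower = [lo for lo, _ in bounds.values()]
    upper = [hi for _, hi in bounds.values()]
    sampler = qmc.Sobol(d=len(bounds), scramble=True, seed=seed)
    with warnings.catch_warnings():
        # SciPy warns when n_samples is not a power of two (500 and 1500
        # are not); the points are still valid Sobol points.
        warnings.simplefilter("ignore", UserWarning)
        unit_points = sampler.random(n_samples)
    # Map points from the unit hypercube to the parameter ranges.
    return qmc.scale(unit_points, lower, upper)


# One design matrix per shape: rows are samples, columns are parameters.
designs = {
    shape: sobol_samples(n, PARAM_BOUNDS, seed=i)
    for i, (shape, n) in enumerate(SAMPLES_PER_SHAPE.items())
}
```

Each row of a design matrix would then correspond to one EnergyPlus input file written in step 2. With the counts above, the nine shapes give 6 x 500 + 3 x 1500 = 7500 samples per storey option.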