Optuna Tuning Results PPO Reinforcement Learning Hyperparameters Performance

Published: 16 October 2024 | Version 1 | DOI: 10.17632/sjp82gkxgz.1
Contributors:
Abdelkader Messlem

Description

Systematic hyperparameter tuning with Optuna was expected to improve the performance of a PPO model in a multi-microgrid environment. We hypothesized that optimizing hyperparameters such as the learning rate and network architecture would enhance model performance, reflected in a higher mean reward and greater training stability.

Data Overview:
Dataset: Results from hyperparameter tuning of a PPO model in a multi-microgrid environment.
Contents: Hyperparameter settings and performance metrics.

Data Collection Process:
Sampling: Hyperparameters were sampled by Optuna and evaluated by training the PPO model for 500,000 timesteps.
Visualization: Use the command tensorboard --logdir=./Logs/PPO_1 to view the training logs in TensorBoard.
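For a quick look at the tuning results without TensorBoard, the CSV file can be loaded with pandas. The sketch below is illustrative only: the file name and the "value" column (Optuna's default objective column when trials are exported with trials_dataframe) are assumptions about how the results were saved, not guaranteed to match the dataset.

    import pandas as pd

    # Load the exported tuning results (file name is an illustrative assumption).
    trials = pd.read_csv("optuna_ppo_trials.csv")

    # Rank trials by the objective value ("value" is Optuna's default column
    # name when a study is exported via study.trials_dataframe()).
    print(trials.sort_values("value", ascending=False).head())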

Files

Steps to reproduce

Objective: The primary goal was to optimize the hyperparameters of a Proximal Policy Optimization (PPO) model for reinforcement learning within a multi-microgrid environment. The aim was to identify optimal settings that enhance the model's performance in controlling battery energy storage systems across multiple microgrids.

Experimental Design:
Optimization Tool: Optuna
Algorithm: Proximal Policy Optimization (PPO)
Environment: Custom multi-microgrid simulation

Setup and Tools:
Optuna: hyperparameter tuning
Stable-Baselines3: PPO implementation
Python: scripting and data management
Custom Environment: simulates battery energy storage systems in multiple microgrids

Protocols and Methods:
Hyperparameter Sampling: Optuna's TPESampler proposes candidate settings.
Training: the PPO model is trained for 500,000 timesteps per trial.
Evaluation: the mean reward is recorded, and trials are pruned based on early performance indicators.
Data Storage: results are written to a CSV file for analysis and logged to TensorBoard for visualization.
Use the command tensorboard --logdir=./Logs/PPO_1 to visualize the training logs with TensorBoard. A minimal tuning sketch is given below.
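The workflow above can be reproduced with a short Optuna study wrapped around Stable-Baselines3's PPO. The following is a minimal sketch, not the authors' script: the environment id (Pendulum-v1 stands in for the custom multi-microgrid simulation), the search space, the chunked training used for pruning, the number of trials, and the output file name are all assumptions made for illustration.

    import gymnasium as gym
    import optuna
    from optuna.pruners import MedianPruner
    from optuna.samplers import TPESampler
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy

    # Stand-in environment; the study used a custom multi-microgrid simulation.
    ENV_ID = "Pendulum-v1"
    TOTAL_TIMESTEPS = 500_000


    def objective(trial: optuna.Trial) -> float:
        # Example search space; the actual ranges used in the study may differ.
        learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
        gamma = trial.suggest_float("gamma", 0.95, 0.9999)
        n_steps = trial.suggest_categorical("n_steps", [512, 1024, 2048])
        net_width = trial.suggest_categorical("net_width", [64, 128, 256])

        env = gym.make(ENV_ID)
        model = PPO(
            "MlpPolicy",
            env,
            learning_rate=learning_rate,
            gamma=gamma,
            n_steps=n_steps,
            policy_kwargs={"net_arch": [net_width, net_width]},
            tensorboard_log="./Logs",
            verbose=0,
        )

        # Train in chunks so early performance can be reported for pruning.
        mean_reward = float("-inf")
        for step in range(5):
            model.learn(total_timesteps=TOTAL_TIMESTEPS // 5, reset_num_timesteps=False)
            mean_reward, _ = evaluate_policy(model, env, n_eval_episodes=5)
            trial.report(mean_reward, step)
            if trial.should_prune():
                raise optuna.TrialPruned()

        # The objective value is the mean reward after the full training budget.
        return mean_reward


    if __name__ == "__main__":
        study = optuna.create_study(
            direction="maximize",
            sampler=TPESampler(),
            pruner=MedianPruner(),
        )
        study.optimize(objective, n_trials=50)
        # Store all trials as a CSV file for later analysis.
        study.trials_dataframe().to_csv("optuna_ppo_trials.csv", index=False)

The TensorBoard logs written under ./Logs can then be inspected with the command given above.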

Institutions

Universite Ibn Khaldoun Tiaret

Categories

Machine Learning, Multi-Objective Optimization

Licence