FLIPPO_LO*TZ
Description
Data reported in "Feudal Independent Leader Proximal Policy Optimization" (FLIPPO) by Austin Starken and Sean Mondesire. The paper answers the research question, "To what extent does a feudal hierarchy enhance independent PPO agents' performance, scalability, and generalizability in a high-dimensional environment with sparse and delayed rewards compared to non-hierarchical methods?" The data files contain performance data for each approach studied in the paper. Performance data was collected at 5 million training steps, at 9 million training steps, and a final time when training was complete. The total number of training steps required by each approach is recorded in COMBINED_TS.xlsx.
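As a minimal sketch of how the spreadsheet data might be inspected, the snippet below loads COMBINED_TS.xlsx with pandas. The file name comes from this dataset; the workbook's sheet and column layout is not documented here, so check the actual file before relying on any particular structure.

```python
# Minimal sketch for inspecting the dataset's spreadsheet files.
# Assumes pandas and openpyxl are installed; the workbook's internal
# layout (sheets, columns) is an assumption -- verify against the file.
import pandas as pd

# Load the total-training-steps workbook included in this dataset.
combined_ts = pd.read_excel("COMBINED_TS.xlsx")

# Quick look at the recorded totals per approach.
print(combined_ts.head())
print(combined_ts.describe())
```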
Files
Steps to reproduce
The data were generated during a study comparing multi-agent deep reinforcement learning approaches, implemented in Python.
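For illustration only, the sketch below shows one way performance snapshots at 5 million and 9 million training steps could be collected for a single PPO learner. This is not the authors' implementation: Stable-Baselines3 and the CartPole-v1 environment are stand-ins for the feudal multi-agent setup and environment studied in the paper.

```python
# Hedged sketch: checkpointing a PPO learner at 5M and 9M training steps.
# Stable-Baselines3 and CartPole-v1 are placeholders, not the study's setup.
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")  # placeholder for the study's environment
model = PPO("MlpPolicy", env, verbose=0)

# Train up to each checkpoint without resetting the step counter, saving
# a snapshot from which performance metrics can later be computed.
for checkpoint in (5_000_000, 9_000_000):
    model.learn(total_timesteps=checkpoint - model.num_timesteps,
                reset_num_timesteps=False)
    model.save(f"ppo_{checkpoint}_steps")
```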