Data from Neural Network Training in the Obstacle Tower Environment to Investigate Embodied, Weakly Supervised Learning

Published: 05-12-2020 | Version 2 | DOI: 10.17632/zdh4d5ws2z.2
Viviane Clay


This repository presents data collected to investigate the role of embodiment and supervision in learning. The investigation takes place inside a simulated 3D maze world with a navigation task that relies mainly on visual input in the form of RGB images. The main contribution of this data repository is a network model trained in this environment with weak supervision and a closed loop between action and perception. Additionally, control networks are provided which were trained with varying degrees of supervision and embodiment. In the corresponding paper [1] the representations of these networks are compared based on sparsity measures as well as the content of the encodings and the possibility of extracting semantic labels from them. For the training of the control conditions several new data sets were created, which are also included here. They contain a collection of images from the simulated world with corresponding semantic labels (hand labeled). Overall, the networks and data sets provide a good basis for further analysis and a more in-depth investigation of representation learning and the effect of embodiment and supervision on representations.
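As a rough illustration of the kind of sparsity comparison mentioned above, the sketch below computes two common sparsity measures over a set of hidden-layer activations. It is not the analysis code from the paper: the choice of measures (Hoyer sparsity and the fraction of inactive units) is an assumption, the function names are hypothetical, and synthetic activations stand in for the provided files, since their on-disk format is described in Description.txt rather than here.

```python
import math
import random


def hoyer_sparsity(vector):
    """Hoyer sparsity: 0 for a uniform vector, 1 for a one-hot vector."""
    n = len(vector)
    l1 = sum(abs(v) for v in vector)
    l2 = math.sqrt(sum(v * v for v in vector))
    if n < 2 or l2 == 0.0:
        # The measure is undefined here; returning 0.0 is a convention
        # chosen for this sketch only.
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


def fraction_inactive(vector, threshold=0.0):
    """Share of units whose absolute activation does not exceed the threshold."""
    return sum(1 for v in vector if abs(v) <= threshold) / len(vector)


def mean_over_frames(activations, measure):
    """Average a per-frame measure over all frames (one row per input frame)."""
    return sum(measure(row) for row in activations) / len(activations)


def synthetic_activations(rng, n_frames, n_units, p_active):
    """Hypothetical stand-in for real activations: each unit is active with
    probability p_active and otherwise exactly zero."""
    return [
        [rng.random() if rng.random() < p_active else 0.0 for _ in range(n_units)]
        for _ in range(n_frames)
    ]


rng = random.Random(0)
sparse_code = synthetic_activations(rng, n_frames=100, n_units=64, p_active=0.1)
dense_code = synthetic_activations(rng, n_frames=100, n_units=64, p_active=0.9)

for name, acts in (("sparse", sparse_code), ("dense", dense_code)):
    print(
        name,
        "hoyer=%.3f" % mean_over_frames(acts, hoyer_sparsity),
        "inactive=%.3f" % mean_over_frames(acts, fraction_inactive),
    )
```

To apply this to the repository data, the synthetic rows would be replaced by the stored activations of one network (one row per test image), and the same measures would be computed for each of the three networks.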


Steps to reproduce

Data was generated through a 3D simulation of a maze environment called Obstacle Tower. The data of interest are the trained neural network weights and the network activations corresponding to different input frames. Three main networks were trained: a reinforcement learning agent trained through interaction with the simulated environment, an autoencoder trained to reconstruct images collected by the agent, and a classifier trained to classify objects in the images. Exact training and testing conditions, hyperparameters and network structure are provided in the corresponding paper [1].

For the training of the reinforcement learning agent, the PPO implementation of the Unity ml-agents toolkit is used with small modifications for extra data collection and control experiments. The code we used can be found here:
Model checkpoint files are saved for different points in training, but the corresponding paper [1] mostly analyses the final version of the network.

The autoencoder and classifier are trained using Python with TensorFlow and Keras. The corresponding code can be found here:
The data also contains the hidden-layer activations of all three networks for 4000 test images. Code for this can be found in the same GitHub repository.

The datasets used for training the autoencoder and classifier were created by collecting observations in the Obstacle Tower environment using the trained agent. These observations were then labelled automatically, and the labels were cross-checked by hand.

A description of the individual files is included in the data folder (Description.txt). Due to storage constraints, not all model checkpoint files used to create figure 6 of the paper could be uploaded. However, feel free to contact me (vkakerbeck[at] if you are interested in these detailed checkpoint files of the control runs and I will make them available to you.

[1] Clay, V., König, P., Kühnberger, K.-U. & Pipa, G. Learning sparse and meaningful representations through embodiment. Neural Networks (2020).