Mutual Adaptation for Human-Robot Co-Learning - USAR task

Published: 21 June 2021 | Version 1 | DOI: 10.17632/r2y8z6bzg8.1
Emma van Zoelen


This data was gathered in an experiment aimed at studying mutual adaptation in a human-robot collaborative task. Human participants were presented with a virtual task and robot inspired by Urban Search and Rescue. They were given limited information about the behavior of the robot, which was driven by a Reinforcement Learning algorithm. The goal for the human-robot team was to learn how to collaborate successfully on this task over the course of eight rounds.

We wanted to gain insight into the following:
- What kind of interaction patterns emerged as a result of the mutual adaptivity of the human and the robot;
- What behavior the robot learned as a result of this;
- What behavior the human learned as a result of this;
- How the learning process of the human might have influenced the learning process of the robot.

To achieve this, we include a file that summarizes the behavior and interactions of the human-robot team in a qualitative manner. Importantly, participants were asked to think aloud, giving us more detail about their behavior. Moreover, we include an Excel file that contains the following data:
- Robot Macro-Actions: the robot used a Q-learning algorithm to choose between 3 macro-actions in each of 4 phases of the task. This sheet shows which macro-action had the highest Q-value for each phase at the end of each round, for each participant.
- Human Clusters: we clustered the human participants into three large clusters based on their behavior.
- Human Strategies: we coded the behavior of the participants for each round of the task.
- Collaboration Fluency Scores: we measured subjective collaboration fluency through 3 short questions asked of each participant after each round of the experiment.

Our most important findings from this data were a set of interaction patterns, as well as the suggestion that the level of adaptivity of the human influenced which macro-actions the robot chose to use.
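The "Robot Macro-Actions" sheet can be read as the argmax of a small tabular Q-function: 4 task phases as states, 3 macro-actions per phase. As a minimal sketch of that setup (hyperparameter values, function names, and the epsilon-greedy policy here are illustrative assumptions, not taken from the experiment code):

```python
import random

# Assumed setup: 4 task phases (states), 3 macro-actions per phase.
PHASES = 4
MACRO_ACTIONS = 3
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # hypothetical hyperparameters

# Tabular Q-values, initialized to zero.
q = [[0.0] * MACRO_ACTIONS for _ in range(PHASES)]

def choose_action(phase):
    """Epsilon-greedy selection among the phase's macro-actions."""
    if random.random() < EPSILON:
        return random.randrange(MACRO_ACTIONS)
    return max(range(MACRO_ACTIONS), key=lambda a: q[phase][a])

def update(phase, action, reward, next_phase):
    """Standard Q-learning update after one macro-action outcome."""
    best_next = max(q[next_phase]) if next_phase is not None else 0.0
    q[phase][action] += ALPHA * (reward + GAMMA * best_next - q[phase][action])

def dominant_actions():
    """Highest-Q macro-action per phase: the quantity the
    'Robot Macro-Actions' sheet records at the end of each round."""
    return [max(range(MACRO_ACTIONS), key=lambda a: q[p][a])
            for p in range(PHASES)]
```

After each round, `dominant_actions()` yields one entry per phase, matching the per-round, per-phase entries in the Excel file.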
The details of the research for which this data was collected and the experiment can be found here:


Steps to reproduce

The code necessary to run the task used to gather this data can be found here:
A detailed description of the experimental protocol can be found here:


TNO Locatie Soesterberg


Reinforcement Learning, Questionnaire, Observation