Abstract: When robots interact with humans in homes, roads, or factories, the human's behavior often changes in response to the robot. Non-stationary humans are challenging for robot learners: actions the robot has learned to coordinate with the original human may fail after the human adapts to the robot. In this paper we introduce an algorithmic formalism that enables robots (i.e., ego agents) to co-adapt alongside dynamic humans (i.e., other agents) using only the robot's low-level states, actions, and rewards. A core challenge is that humans not only react to the robot's behavior, but the way in which humans react inevitably changes both over time and between users. To deal with this challenge, our insight is that -- instead of building an exact model of the human -- robots can learn and reason over high-level representations of the human's policy and policy dynamics. Applying this insight, we develop RILI: Robustly Influencing Latent Intent. RILI first embeds low-level robot observations into predictions of the human's latent strategy and strategy dynamics. Next, RILI harnesses these predictions to select actions that influence the adaptive human towards advantageous, high-reward behaviors over repeated interactions. We demonstrate that -- given RILI's measured performance with users sampled from an underlying distribution -- we can probabilistically bound RILI's expected performance across new humans sampled from the same distribution. Our simulated experiments compare RILI to state-of-the-art representation and reinforcement learning baselines, and show that RILI better learns to coordinate with imperfect, noisy, and time-varying agents. Finally, we conduct two user studies where RILI co-adapts alongside actual humans in a game of tag and a tower-building task. See videos of our user studies here: https://youtu.be/WYGO5amDXbQ

Co-Imitation: Learning Design and Behaviour by Imitation

Motion Control of Bionic Robots Via Biomimetic Learning

Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning

Imitating by Generating: Deep Generative Models for Imitation of Interactive Tasks

Robotic Imitation of Human Actions

Imitation and Adaptation Based on Consistency: A Quadruped Robot Imitates Animals from Videos Using Deep Reinforcement Learning

Learning Latent Representations to Co-Adapt to Humans

An Algorithmic Perspective on Imitation Learning

Sample-efficient Adversarial Imitation Learning from Observation

Identifying Interaction Patterns of Tangible Co-Adaptations in Human-Robot Team Behaviors

Bi-Level Motion Imitation for Humanoid Robots

MimicPlay: Long-Horizon Imitation Learning by Watching Human Play

Becoming Team Members: Identifying Interaction Patterns of Mutual Adaptation for Human-Robot Co-Learning

HumanMimic: Learning Natural Locomotion and Transitions for Humanoid Robot via Wasserstein Adversarial Imitation

State-only Imitation with Transition Dynamics Mismatch

Selective imitation on the basis of reward function similarity

Whole-body humanoid robot imitation with pose similarity evaluation

Towards Learning to Imitate from a Single Video Demonstration

Imitation Learning via Simultaneous Optimization of Policies and Auxiliary Trajectories

Metric-Based Imitation Learning Between Two Dissimilar Anthropomorphic Robotic Arms

One-Shot Imitation under Mismatched Execution