Abstract: Imitation Learning (IL) is a powerful paradigm for teaching robots to perform manipulation tasks by allowing them to learn from human demonstrations collected via teleoperation, but it has mostly been limited to single-arm manipulation. However, many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk. Unfortunately, applying IL to multi-arm manipulation tasks has been challenging: asking a human to control more than one robotic arm can impose a significant cognitive burden and is often only possible for a maximum of two robot arms. To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks. Using MART, we collected demonstrations for five novel two- and three-arm tasks from several geographically separated users. From our data we arrived at a critical insight: most multi-arm tasks do not require global coordination throughout their full duration, but only during specific moments. We show that learning from such data consequently presents challenges for centralized agents that directly attempt to model all robot actions simultaneously, and we perform a comprehensive study of different policy architectures with varying levels of centralization on our tasks. Finally, we propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks. Additional results and videos are available at https://roboturk.stanford.edu/multiarm.
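
To make the base-residual idea in the abstract concrete, the following is a minimal sketch of how a centralized base policy might be combined with decentralized per-arm residuals at action time. It is an illustration under stated assumptions, not the authors' implementation: the function names (`base_residual_action`, `toy_base`, `toy_residual`), the residual inputs (each arm's own observation plus its base action), the clipping bound `residual_limit`, and the toy policies are all hypothetical, and the abstract does not specify any of these details.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def clip(values: Sequence[float], limit: float) -> Vector:
    """Clamp every entry to [-limit, limit] so a residual stays a small correction."""
    return [max(-limit, min(limit, v)) for v in values]


def base_residual_action(
    base_policy: Callable[[Vector], List[Vector]],
    residual_policies: Sequence[Callable[[Vector, Vector], Vector]],
    arm_observations: Sequence[Vector],
    residual_limit: float = 0.05,  # assumed bound, not taken from the paper
) -> List[Vector]:
    """Combine one centralized base policy with one residual per arm.

    The base policy sees the concatenated observations of all arms and
    returns one action per arm. Each residual sees only its own arm's
    observation and base action, and adds a bounded correction.
    """
    # Centralized step: the base policy conditions on every arm at once.
    joint_observation = [x for obs in arm_observations for x in obs]
    base_actions = base_policy(joint_observation)
    if len(base_actions) != len(arm_observations):
        raise ValueError("base policy must return one action per arm")
    if len(residual_policies) != len(arm_observations):
        raise ValueError("need exactly one residual policy per arm")

    # Decentralized step: each arm corrects its own action independently.
    final_actions = []
    for obs, base_a, residual in zip(arm_observations, base_actions, residual_policies):
        delta = clip(residual(obs, base_a), residual_limit)
        final_actions.append([a + d for a, d in zip(base_a, delta)])
    return final_actions


def toy_base(joint_observation: Vector) -> List[Vector]:
    """Hypothetical stand-in for a trained centralized policy (two arms, 2-D actions)."""
    mean = sum(joint_observation) / len(joint_observation)
    return [[mean, mean], [mean, mean]]


def toy_residual(observation: Vector, base_action: Vector) -> Vector:
    """Hypothetical stand-in for a trained per-arm residual model."""
    return [observation[0] - a for a in base_action]


if __name__ == "__main__":
    observations = [[1.0, 0.0], [0.0, 0.0]]
    print(base_residual_action(toy_base, [toy_residual, toy_residual], observations))
```

In this sketch, bounding the residual keeps the centralized policy in charge of the overall motion while letting each arm make small local adjustments on its own, which is one plausible way to handle tasks where coordination is needed only at specific moments.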

Inferring and Learning Multi-Robot Policies by Observing an Expert

Generalize Robot Learning from Demonstration to Variant Scenarios with Evolutionary Policy Gradient

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Human-in-the-Loop Imitation Learning using Remote Teleoperation

Adaptive Robot Assistance: Expertise and Influence in Multi-User Task Planning

Enabling Multi-Robot Collaboration from Single-Human Guidance

Imitation Learning via Simultaneous Optimization of Policies and Auxiliary Trajectories

Training Robots without Robots: Deep Imitation Learning for Master-to-Robot Policy Transfer

Robot Policy Improvement With Natural Evolution Strategies for Stable Nonlinear Dynamical System

Learning One-Shot Imitation From Humans Without Humans

Learning Exploration Strategies to Solve Real-World Marble Runs

Efficient Robot Skill Learning with Imitation from a Single Video for Contact-Rich Fabric Manipulation

XAI-N: Sensor-based Robot Navigation using Expert Policies and Decision Trees

Scaling Cross-Embodied Learning: One Policy for Manipulation, Navigation, Locomotion and Aviation

Learning When to Ask for Help: Efficient Interactive Navigation via Implicit Uncertainty Estimation

SafeAPT: Safe Simulation-to-Real Robot Learning Using Diverse Policies Learned in Simulation

Generalization of Heterogeneous Multi-Robot Policies via Awareness and Communication of Capabilities

Multi-Task Policy Search

Multi-Robot Informative Path Planning for Efficient Target Mapping using Deep Reinforcement Learning

Output Feedback Tube MPC-Guided Data Augmentation for Robust, Efficient Sensorimotor Policy Learning