Abstract: Imitation Learning (IL) is a powerful paradigm for teaching robots to perform manipulation tasks by letting them learn from human demonstrations collected via teleoperation, but it has mostly been limited to single-arm manipulation. However, many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk. Unfortunately, applying IL to multi-arm manipulation tasks has been challenging: asking a human to control more than one robotic arm can impose a significant cognitive burden and is often possible for at most two robot arms. To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks. Using MART, we collected demonstrations for five novel two- and three-arm tasks from several geographically separated users. From our data we arrived at a critical insight: most multi-arm tasks do not require global coordination throughout their full duration, but only during specific moments. We show that learning from such data consequently presents challenges for centralized agents that directly attempt to model all robot actions simultaneously, and we perform a comprehensive study of different policy architectures with varying levels of centralization on our tasks. Finally, we propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks. Additional results and videos are available at https://roboturk.stanford.edu/multiarm.
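The abstract does not spell out how the base and residual models are combined, so the following is only a minimal sketch of one plausible reading: a centralized base policy sees every arm's observation and proposes a joint action, and each arm's decentralized residual model adds a small, bounded correction using only that arm's observation and its slice of the base action. The names `make_linear_policy` and `base_residual_action`, the `residual_scale` bound, the choice to feed the base action into the residual, and the random linear maps standing in for trained networks are all illustrative assumptions, not the authors' implementation.

```python
import math
import random


def make_linear_policy(obs_dim, act_dim, rng):
    """Hypothetical stand-in for a trained network: returns obs -> W @ obs."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(obs_dim)]
               for _ in range(act_dim)]

    def policy(obs):
        return [sum(w * o for w, o in zip(row, obs)) for row in weights]

    return policy


def base_residual_action(base, residuals, per_arm_obs, act_dim,
                         residual_scale=0.05):
    """Combine a centralized base action with per-arm residual corrections.

    Assumption: each correction is squashed with tanh and scaled, so it can
    never move an action component by more than `residual_scale`.
    """
    # Centralized base: conditions on the observations of all arms at once.
    joint_obs = [x for obs in per_arm_obs for x in obs]
    base_action = base(joint_obs)

    action = []
    for i, (obs, residual) in enumerate(zip(per_arm_obs, residuals)):
        # This arm's slice of the joint base action.
        arm_base = base_action[i * act_dim:(i + 1) * act_dim]
        # Decentralized residual: sees only this arm's observation and base action.
        raw = residual(obs + arm_base)
        action.extend(a + residual_scale * math.tanh(r)
                      for a, r in zip(arm_base, raw))
    return action


rng = random.Random(0)
n_arms, obs_dim, act_dim = 2, 6, 3
base = make_linear_policy(n_arms * obs_dim, n_arms * act_dim, rng)
residuals = [make_linear_policy(obs_dim + act_dim, act_dim, rng)
             for _ in range(n_arms)]
observations = [[rng.gauss(0.0, 1.0) for _ in range(obs_dim)]
                for _ in range(n_arms)]
joint_action = base_residual_action(base, residuals, observations, act_dim)
```

Bounding the residual keeps the combined policy close to the centralized base while still letting each arm adjust its own action independently, which is one way to accommodate tasks that need coordination only at specific moments.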

Robust and High-Precision End-to-End Control Policy for Multi-stage Manipulation Task with Behavioral Cloning

Ensemble Bootstrapped Deep Deterministic Policy Gradient For Vision-Based Robotic Grasping

Leveraging the Efficiency of Multi-Task Robot Manipulation Via Task-Evoked Planner and Reinforcement Learning

Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation

Concept2Robot: Learning Manipulation Concepts from Instructions and Human Demonstrations

Vision-Based Multi-Task Manipulation for Inexpensive Robots Using End-To-End Learning from Demonstration

Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Polybot: Training One Policy Across Robots While Embracing Variability

Behavior policy learning: Learning multi-stage tasks via solution sketches and model-based controllers

Exploiting Symmetry and Heuristic Demonstrations in Off-policy Reinforcement Learning for Robotic Manipulation

Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning

Continual Policy Distillation of Reinforcement Learning-based Controllers for Soft Robotic In-Hand Manipulation

Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation

Multi-task Learning with Gradient Guided Policy Specialization

Learning Cross-hand Policies for High-DOF Reaching and Grasping

Visual-Policy Learning through Multi-Camera View to Single-Camera View Knowledge Distillation for Robot Manipulation Tasks

Meta-Policy Learning over Plan Ensembles for Robust Articulated Object Manipulation

Guided Reinforcement Learning for Robust Multi-Contact Loco-Manipulation

Robot Policy Improvement With Natural Evolution Strategies for Stable Nonlinear Dynamical System

BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation