AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent

Tongzhou Mu, Yijie Guo, Jie Xu, Ankit Goyal, Hao Su, Dieter Fox, Animesh Garg

2024-04-11

Abstract: Encouraged by the remarkable achievements of language and vision foundation models, developing generalist robotic agents through imitation learning, using large demonstration datasets, has become a prominent area of interest in robot learning. The efficacy of imitation learning is heavily reliant on the quantity and quality of the demonstration datasets. In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents. We introduce AdaDemo (Adaptive Online Demonstration Expansion), a general framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset. AdaDemo strategically collects new demonstrations to address the identified weakness in the existing policy, ensuring data efficiency is maximized. Through a comprehensive evaluation on a total of 22 tasks across two robotic manipulation benchmarks (RLBench and Adroit), we demonstrate AdaDemo's capability to progressively improve policy performance by guiding the generation of high-quality demonstration datasets in a data-efficient manner.

Robotics, Machine Learning

What problem does this paper attempt to address?

This paper addresses how to expand demonstration data efficiently when training a generalist robotic agent. Imitation learning from large demonstration datasets has attracted wide attention as a route to robotic foundation models, but its success depends heavily on the quantity, quality, and diversity of the data. The paper proposes AdaDemo (Adaptive Online Demonstration Expansion), a framework that improves multi-task policy learning by actively and continually expanding the demonstration dataset. Rather than adding demonstrations uniformly, AdaDemo collects new ones where the existing policy is weak, so that each added demonstration contributes as much as possible. Evaluations on the RLBench and Adroit robotic manipulation benchmarks show that AdaDemo progressively improves policy performance with high data efficiency, reducing the required amount of data by half or one-third compared with uniformly collecting more demonstrations.
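The idea of directing new demonstrations at a policy's weak spots can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's procedure: the function name `allocate_demo_budget`, the task names, and the rule of splitting a fixed budget in proportion to each task's failure rate are all hypothetical. The summary above only states that AdaDemo targets identified weaknesses instead of collecting uniformly.

```python
def allocate_demo_budget(success_rates, budget):
    """Split `budget` new demonstrations across tasks by failure rate.

    Hypothetical allocation rule for illustration only: tasks where the
    current policy fails more often receive more new demonstrations.
    """
    failure = {task: 1.0 - rate for task, rate in success_rates.items()}
    total = sum(failure.values())
    if total == 0:
        # The policy already succeeds everywhere, so nothing is collected.
        return {task: 0 for task in success_rates}

    # Ideal (fractional) share of the budget for each task.
    raw = {task: budget * f / total for task, f in failure.items()}
    # Round down, then hand out the leftover units by largest remainder
    # so the allocations sum exactly to the budget.
    alloc = {task: int(share) for task, share in raw.items()}
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda t: raw[t] - alloc[t], reverse=True)
    for task in by_remainder[:leftover]:
        alloc[task] += 1
    return alloc


# Hypothetical per-task success rates measured for the current policy.
rates = {"open_drawer": 0.9, "stack_blocks": 0.5, "insert_peg": 0.2}
allocation = allocate_demo_budget(rates, budget=14)
print(allocation)
```

A uniform baseline would give each of the three tasks the same share of the 14 demonstrations; the sketch instead shifts most of the budget to the task the policy handles worst.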

Generalize Robot Learning from Demonstration to Variant Scenarios with Evolutionary Policy Gradient

Expert demonstrations guide reward decomposition for multi-agent cooperation

ARCADE: Scalable Demonstration Collection and Generation via Augmented Reality for Imitation Learning

RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking

Efficiently Training On-Policy Actor-Critic Networks in Robotic Deep Reinforcement Learning with Demonstration-like Sampled Exploration

Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments

Learning Generalizable 3D Manipulation With 10 Demonstrations

Demonstration actor critic

Overcoming Exploration in Reinforcement Learning with Demonstrations

Learning with Dual Demonstration Domains: Random Domain-Adaptive Meta-Learning

DiffGen: Robot Demonstration Generation via Differentiable Physics Simulation, Differentiable Rendering, and Vision-Language Model

SkillMimicGen: Automated Demonstration Generation for Efficient Skill Learning and Deployment

RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations

Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards

Cooperative Multi-Agent Policy Gradients with Sub-optimal Demonstration

Learning Complicated Manipulation Skills via Deterministic Policy with Limited Demonstrations

Active Fine-Tuning of Generalist Policies

Domain Adaptation of Visual Policies with a Single Demonstration

Learning Visual Robotic Control Efficiently with Contrastive Pre-training and Data Augmentation

Human Demonstrations are Generalizable Knowledge for Robots