AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent

Tongzhou Mu, Yijie Guo, Jie Xu, Ankit Goyal, Hao Su, Dieter Fox, Animesh Garg
2024-04-11
Abstract: Encouraged by the remarkable achievements of language and vision foundation models, developing generalist robotic agents through imitation learning, using large demonstration datasets, has become a prominent area of interest in robot learning. The efficacy of imitation learning is heavily reliant on the quantity and quality of the demonstration datasets. In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents. We introduce AdaDemo (Adaptive Online Demonstration Expansion), a general framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset. AdaDemo strategically collects new demonstrations to address the identified weakness in the existing policy, ensuring data efficiency is maximized. Through a comprehensive evaluation on a total of 22 tasks across two robotic manipulation benchmarks (RLBench and Adroit), we demonstrate AdaDemo's capability to progressively improve policy performance by guiding the generation of high-quality demonstration datasets in a data-efficient manner.
Robotics, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to expand demonstration data efficiently when training a generalist robotic agent. Imitation learning from large demonstration datasets has attracted wide attention as a route to robotic foundation models, but its success depends heavily on the quantity, quality, and diversity of the data. The paper proposes AdaDemo (Adaptive Online Demonstration Expansion), a framework that improves multi-task policy learning by actively and continually expanding the demonstration dataset. AdaDemo strategically collects new demonstrations that target weaknesses in the current policy, which maximizes data efficiency. Extensive evaluations on the RLBench and Adroit robotic manipulation benchmarks show that AdaDemo progressively improves policy performance with high data efficiency, reducing the required amount of data by half or one-third compared to methods that uniformly collect more demonstrations.
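To make the contrast with uniform collection concrete, the sketch below shows one possible way to direct a fixed budget of new demonstrations toward the tasks where the current policy fails most often. It is an illustration only, not the paper's actual procedure: the function name `allocate_demos`, the proportional-to-failure rule, and the task names are all assumptions introduced here.

```python
from typing import Dict


def allocate_demos(successes: Dict[str, int], episodes: int, budget: int) -> Dict[str, int]:
    """Split `budget` new demonstrations across tasks in proportion to failures.

    Hypothetical rule for illustration: a task that failed more often during
    evaluation receives a larger share of the new demonstrations.
    """
    # Number of failed evaluation episodes per task.
    failures = {task: episodes - wins for task, wins in successes.items()}
    total = sum(failures.values())
    if total == 0:
        # The policy succeeds everywhere, so no new demonstrations are needed.
        return {task: 0 for task in successes}

    # Ideal (fractional) share of the budget for each task.
    shares = {task: budget * f / total for task, f in failures.items()}
    # Round down first, then hand leftovers to the largest fractional parts.
    alloc = {task: int(share) for task, share in shares.items()}
    leftover = budget - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda t: shares[t] - alloc[t], reverse=True)
    for task in by_remainder[:leftover]:
        alloc[task] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical evaluation results: successes out of 10 episodes per task.
    results = {"open_drawer": 9, "stack_blocks": 4, "turn_tap": 7}
    print(allocate_demos(results, episodes=10, budget=20))
```

A uniform baseline would give each of the three tasks the same number of new demonstrations regardless of performance; under the assumed rule above, the weakest task receives the largest share. In an online setting, this allocation would be recomputed after each round of retraining and evaluation.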