One ACT Play: Single Demonstration Behavior Cloning with Action Chunking Transformers

Abraham George, Amir Barati Farimani
2023-09-19
Abstract: Learning from human demonstrations (behavior cloning) is a cornerstone of robot learning. However, most behavior cloning algorithms require a large number of demonstrations to learn a task, especially for general tasks that have a large variety of initial conditions. Humans, however, can learn to complete tasks, even complex ones, after only seeing one or two demonstrations. Our work seeks to emulate this ability, using behavior cloning to learn a task given only a single human demonstration. We achieve this goal by using linear transforms to augment the single demonstration, generating a set of trajectories for a wide range of initial conditions. With these demonstrations, we are able to train a behavior cloning agent to successfully complete three block manipulation tasks. Additionally, we developed a novel addition to the temporal ensembling method used by action chunking agents during inference. By incorporating the standard deviation of the action predictions into the ensembling method, our approach is more robust to unforeseen changes in the environment, resulting in significant performance improvements.
Robotics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses the low sample efficiency of behavior cloning in robot learning. Most behavior cloning algorithms need a large number of demonstrations to learn a task, especially general tasks with diverse initial conditions, whereas humans can learn complex tasks after observing only one or two demonstrations. The goal of the paper is therefore to train a behavior cloning agent from a single human demonstration. To do so, the authors augment the single demonstration with linear transformations, generating a set of trajectories that covers a broad portion of the task state space. They also extend the temporal ensembling method used by action chunking agents at inference time: incorporating the standard deviation of the action predictions makes the agent more robust to unexpected changes in the environment and significantly improves performance.

The main contributions of the paper are:

1. **Single Demonstration Augmentation**: Generating multiple trajectories from one demonstration through linear transformations to increase the diversity of the training data.
2. **Improved Temporal Ensembling**: Introducing a dynamic temperature value based on the standard deviation of the action predictions to adjust the ensembling strategy, improving the handling of multi-modal action distributions.
3. **Experimental Validation**: Conducting experiments on three block manipulation tasks to validate the method, with preliminary tests on hardware.

Together, these methods enable a behavior cloning agent to learn and complete complex tasks from only a single human demonstration.
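
The two technical ideas summarized above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: this summary does not give the exact transforms or the formula relating the standard deviation to the temperature. The function names (`augment_trajectory`, `ensemble_actions`), the planar scale/rotate/translate transform, and the rule `temperature = std_gain * spread` are all hypothetical choices made for the example.

```python
import numpy as np


def augment_trajectory(positions, scale, rotation_deg, translation):
    """Map a (T, 2) array of xy waypoints through a planar linear transform.

    Assumption: one scale + rotation + translation per augmented copy,
    chosen so the trajectory lines up with a new start/goal configuration.
    """
    theta = np.deg2rad(rotation_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    # Row vectors times rot.T is equivalent to rot @ p for each waypoint p.
    return scale * (np.asarray(positions, dtype=float) @ rot.T) + np.asarray(translation)


def ensemble_actions(predictions, std_gain=10.0):
    """Blend overlapping chunk predictions for the current timestep.

    predictions: (k, action_dim) array, oldest prediction first.
    Hypothetical rule: the temperature grows with the spread of the
    predictions, so disagreement shifts weight toward the newest one.
    """
    preds = np.asarray(predictions, dtype=float)
    ages = np.arange(len(preds) - 1, -1, -1)   # newest prediction has age 0
    spread = preds.std(axis=0).mean()          # mean per-dimension std
    temperature = std_gain * spread            # assumed mapping, not from the paper
    weights = np.exp(-temperature * ages)
    weights /= weights.sum()
    return weights @ preds


if __name__ == "__main__":
    demo = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
    print(augment_trajectory(demo, scale=1.0, rotation_deg=90.0, translation=(0.2, 0.1)))
    print(ensemble_actions([[0.0], [1.0]]))
```

When all predictions agree, the spread is zero and the sketch reduces to a plain average; when they disagree, as might happen after an object is moved mid-episode, the newest prediction dominates.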