One ACT Play: Single Demonstration Behavior Cloning with Action Chunking Transformers

Abraham George, Amir Barati Farimani

2023-09-19

Abstract: Learning from human demonstrations (behavior cloning) is a cornerstone of robot learning. However, most behavior cloning algorithms require a large number of demonstrations to learn a task, especially for general tasks that have a large variety of initial conditions. Humans, however, can learn to complete tasks, even complex ones, after only seeing one or two demonstrations. Our work seeks to emulate this ability, using behavior cloning to learn a task given only a single human demonstration. We achieve this goal by using linear transforms to augment the single demonstration, generating a set of trajectories for a wide range of initial conditions. With these demonstrations, we are able to train a behavior cloning agent to successfully complete three block manipulation tasks. Additionally, we developed a novel addition to the temporal ensembling method used by action chunking agents during inference. By incorporating the standard deviation of the action predictions into the ensembling method, our approach is more robust to unforeseen changes in the environment, resulting in significant performance improvements.

Robotics, Artificial Intelligence, Machine Learning

What problem does this paper attempt to address?

The paper addresses the low sample efficiency of behavior cloning in robot learning. Most behavior cloning algorithms need a large number of demonstrations to learn a task, especially general tasks with diverse initial conditions, whereas humans can learn complex tasks after observing only one or two demonstrations. The goal of the paper is therefore to train a behavior cloning agent that completes a task after seeing only a single human demonstration.

To achieve this, the authors augment the single demonstration with linear transformations, generating a set of trajectories that covers a broad portion of the task state space. They also develop a novel addition to the temporal ensembling method used by action chunking agents: incorporating the standard deviation of the action predictions during inference makes the agent more robust to unexpected changes in the environment and significantly improves performance.

The main contributions of the paper are:

1. **Single Demonstration Augmentation**: Generating multiple trajectories through linear transformations to increase the diversity of the training data.
2. **Improved Temporal Ensembling Method**: Introducing a dynamic temperature value based on the standard deviation of the action predictions to adjust the ensembling strategy, improving the handling of multi-modal action distributions.
3. **Experimental Validation**: Conducting experiments on three block manipulation tasks to validate the effectiveness of the method, with preliminary tests on hardware.

Together, these methods enable a behavior cloning agent to learn and complete these tasks from only a single human demonstration.
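The two ideas summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the exact transforms or the weighting formula, so everything below is an assumption made for illustration. The function names `augment_demo` and `ensemble_action`, the parameters `base_temp` and `std_gain`, the restriction to 2-D positions, and the rule "higher disagreement gives more weight to newer predictions" are all hypothetical.

```python
import numpy as np


def augment_demo(positions, new_start, new_goal):
    """Map a demonstrated 2-D path onto a new start/goal pair.

    Assumption: a similarity transform (rotation, uniform scale,
    translation) stands in for the paper's linear transforms.
    The demonstration must start and end at different points.
    """
    positions = np.asarray(positions, dtype=float)
    new_start = np.asarray(new_start, dtype=float)
    new_goal = np.asarray(new_goal, dtype=float)

    old_vec = positions[-1] - positions[0]
    new_vec = new_goal - new_start

    # Scale and rotation that carry the old start->goal vector onto the new one.
    scale = np.linalg.norm(new_vec) / np.linalg.norm(old_vec)
    angle = np.arctan2(new_vec[1], new_vec[0]) - np.arctan2(old_vec[1], old_vec[0])
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])

    # Shift to the origin, rotate and scale, then move to the new start.
    return (positions - positions[0]) @ (scale * rot).T + new_start


def ensemble_action(predictions, base_temp=0.01, std_gain=1.0):
    """Blend overlapping chunk predictions for the current timestep.

    predictions: shape (n, action_dim), ordered oldest to newest.
    Assumption: the temperature grows with the spread of the
    predictions, so disagreement shifts weight toward newer ones.
    """
    preds = np.asarray(predictions, dtype=float)
    spread = preds.std(axis=0).mean()      # scalar disagreement measure
    temp = base_temp + std_gain * spread   # dynamic temperature (hypothetical form)
    age = np.arange(len(preds))[::-1]      # 0 for the newest prediction
    weights = np.exp(-temp * age)
    weights /= weights.sum()
    return weights @ preds
```

With `std_gain=0` the second function reduces to a fixed-temperature exponential weighting, which makes it easy to compare the two behaviors on the same set of predictions.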

Learning Robot Manipulation Skills from Human Demonstration Videos Using Two-Stream 2-D/3-D Residual Networks with Self-Attention

Behavioral Cloning via Search in Embedded Demonstration Dataset

Behavior Transformers: Cloning $k$ modes with one stone

Transformers for One-Shot Visual Imitation

Learning to Play by Imitating Humans

Never-Ending Behavior-Cloning Agent for Robotic Manipulation

Behavior Cloning and Replay of Humanoid Robot Via a Depth Camera

Problem Space Transformations for Generalisation in Behavioural Cloning

One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

RoboAgent: Generalization and Efficiency in Robot Manipulation via Semantic Augmentations and Action Chunking

Behavior Cloned Transformers are Neurosymbolic Reasoners

Towards Learning to Imitate from a Single Video Demonstration

ARCADE: Scalable Demonstration Collection and Generation via Augmented Reality for Imitation Learning

Zero-shot Imitation Policy via Search in Demonstration Dataset

Perceiver-Actor: A Multi-Task Transformer for Robotic Manipulation

Goal-conditioned Behavioral Cloning with Prioritized Sampling

DITTO: Demonstration Imitation by Trajectory Transformation

COLLECT AND PREPARE DATASET FROM HUMAN-DEMONSTRATIONS OF SIMPLE MANIPULATION TASKS USING XARM7 ROBOTIC ARM FOR TRAINING BEHAVIOUR CLONING ALGORITHM

Learning Multi-Step Manipulation Tasks from A Single Human Demonstration

AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent