Abstract: Imitation learning is a promising paradigm for training robot control policies, but these policies can suffer from distribution shift, where the conditions at evaluation time differ from those in the training data. A popular approach for increasing policy robustness to distribution shift is interactive imitation learning (i.e., DAgger and variants), where a human operator provides corrective interventions during policy rollouts. However, collecting a sufficient amount of interventions to cover the distribution of policy mistakes can be burdensome for human operators. We propose IntervenGen (I-Gen), a novel data generation system that can autonomously produce a large set of corrective interventions with rich coverage of the state space from a small number of human interventions. We apply I-Gen to 4 simulated environments and 1 physical environment with object pose estimation error and show that it can increase policy robustness by up to 39x with only 10 human interventions. Videos and more results are available at

What problem does this paper attempt to address?

The paper addresses the performance degradation that learned robot imitation policies suffer at deployment when operating conditions differ from those in the training data (i.e., distribution shift). Specifically, when a robot acts on object pose observations, factors such as sensor noise, occlusion, network latency, or model misspecification can corrupt those observations and yield inaccurate estimates of key object positions. The robot then encounters states absent from the training data, and policy performance suffers.

Existing ways to improve robustness to such shifts include collecting a large amount of demonstration data under varied conditions, or using interactive imitation learning (e.g., DAgger and its variants), in which human operators provide corrective interventions during policy execution. Both carry significant human labor costs: the former requires substantial time and resources for data collection, while the latter requires operators to continuously monitor the robot and intervene when necessary.

The paper proposes a data generation system called IntervenGen (I-Gen), which automatically generates a large set of corrective interventions from a small number of human interventions, covering a broader portion of the state space and of the distribution of policy mistakes. This reduces reliance on human operators while still improving policy robustness. Experimental results show that with only 10 human interventions, I-Gen can increase policy robustness by up to 39x, in experiments spanning 4 simulated environments and 1 physical environment.
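The idea of expanding a few human interventions into many can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 2D workspace, a hypothetical `Intervention` record, and a hypothetical `synthesize` helper that re-expresses one recorded correction relative to the true object position and replays it under newly sampled object positions and pose-estimate errors. The noise scale and workspace size are illustrative values only.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Intervention:
    """Hypothetical record of one corrective intervention (toy 2D setting)."""
    true_obj: Point         # actual object position
    believed_obj: Point     # erroneous pose estimate the policy acted on
    waypoints: List[Point]  # corrective end-effector path in the world frame


def synthesize(source: Intervention, n: int, noise: float = 0.05,
               workspace: float = 0.3, seed: int = 0) -> List[Intervention]:
    """Replay one human correction under n new sampled conditions.

    Assumption: the correction is object-centric, so translating it with
    the true object position yields a valid correction in the new scene.
    """
    rng = random.Random(seed)
    ox, oy = source.true_obj
    # Express the human path relative to the true object position.
    relative = [(x - ox, y - oy) for x, y in source.waypoints]

    generated = []
    for _ in range(n):
        # Sample a new object placement and a new pose-estimation error.
        new_obj = (rng.uniform(-workspace, workspace),
                   rng.uniform(-workspace, workspace))
        new_belief = (new_obj[0] + rng.gauss(0.0, noise),
                      new_obj[1] + rng.gauss(0.0, noise))
        # Re-anchor the relative path at the new object position.
        new_path = [(new_obj[0] + dx, new_obj[1] + dy) for dx, dy in relative]
        generated.append(Intervention(new_obj, new_belief, new_path))
    return generated


# One human correction: start at the wrong estimate, end at the true object.
human = Intervention(true_obj=(0.0, 0.0), believed_obj=(0.1, 0.0),
                     waypoints=[(0.1, 0.0), (0.05, 0.0), (0.0, 0.0)])
dataset = synthesize(human, n=100)
```

In this toy version every generated path ends at the new true object position, so the synthetic data varies the scene and the estimate error while reusing the single human correction. A real system would also need to execute and validate each generated trajectory, which this sketch omits.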

IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning
