CCIL: Continuity-based Data Augmentation for Corrective Imitation Learning

Liyiming Ke, Yunchu Zhang, Abhay Deshpande, Siddhartha Srinivasa, Abhishek Gupta
2024-06-04
Abstract: We present a new technique to enhance the robustness of imitation learning methods by generating corrective data to account for compounding errors and disturbances. While existing methods rely on interactive expert labeling, additional offline datasets, or domain-specific invariances, our approach requires minimal additional assumptions beyond access to expert data. The key insight is to leverage local continuity in the environment dynamics to generate corrective labels. Our method first constructs a dynamics model from the expert demonstration, encouraging local Lipschitz continuity in the learned model. In locally continuous regions, this model allows us to generate corrective labels within the neighborhood of the demonstrations but beyond the actual set of states and actions in the dataset. Training on this augmented data enhances the agent's ability to recover from perturbations and deal with compounding errors. We demonstrate the effectiveness of our generated labels through experiments in a variety of robotics domains in simulation that have distinct forms of continuity and discontinuity, including classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation.
Robotics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to make imitation learning more robust by generating corrective data that accounts for compounding errors and disturbances. Existing approaches often rely on interactive expert labeling, additional offline datasets, or domain-specific invariances. The method proposed in this paper requires only minimal additional assumptions beyond access to expert data.

**Main Objectives:**

- **Enhance the robustness of imitation learning**: Augment the dataset with generated corrective labels so that the learned policy can better handle states that do not appear in the demonstrations.
- **Utilize local continuity**: Leverage local continuity in the environment dynamics to generate corrective labels in the neighborhood of the demonstrations, beyond the support of the expert data.
- **Theoretical guarantees**: Provide theoretical guarantees on the quality of the generated labels and show how to impose the required local smoothness when training the dynamics function.
- **Extensive validation**: Conduct experiments in a variety of simulated robotics domains, covering classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation.

### Main Contributions

1. **Problem Definition**: Formally define the corrective labels used to enhance the robustness of imitation learning.
2. **Practical Algorithm**: Propose a framework named CCIL (Continuity-based Data Augmentation for Corrective Imitation Learning), which generates corrective labels by learning a dynamics model with local continuity.
3. **Theoretical Guarantees**: Show how local continuity in the system dynamics lets the learned model be used beyond the demonstrated states and actions, and provide theoretical bounds on the quality of the generated labels.
4. **Extensive Empirical Validation**: Conduct extensive experiments in four different robotic simulation environments to validate the effectiveness and robustness of the CCIL method.

Through these contributions, the paper demonstrates how leveraging local continuity can improve the robustness and generalization ability of imitation learning methods without requiring a large number of expert demonstrations.
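
To make the idea of corrective labels more concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's exact procedure. It assumes toy linear dynamics `s' = s + 0.1 * a`, fits a linear dynamics model by least squares as a stand-in for a learned model (a linear map is globally Lipschitz, so no extra continuity regularizer is needed here), and then, for each demonstrated transition, perturbs the state slightly and solves for the action that the model predicts will return to the demonstrated next state. The function names `fit_linear_dynamics` and `generate_corrective_labels`, the toy expert, and the noise scale are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_linear_dynamics(states, actions, next_states):
    """Fit s' ~= s @ A + a @ B by least squares (stand-in for a learned model)."""
    inputs = np.hstack([states, actions])
    weights, *_ = np.linalg.lstsq(inputs, next_states, rcond=None)
    state_dim = states.shape[1]
    return weights[:state_dim], weights[state_dim:]  # A, B


def generate_corrective_labels(states, next_states, A, B, noise_scale, rng):
    """Perturb each demo state and solve for the action that the model
    predicts will land on the demonstrated next state."""
    perturbed = states + noise_scale * rng.standard_normal(states.shape)
    # Solve a @ B = s_next - perturbed @ A for a, one row per transition.
    target = next_states - perturbed @ A
    actions_t, *_ = np.linalg.lstsq(B.T, target.T, rcond=None)
    return perturbed, actions_t.T


# Toy demonstrations (hypothetical): true dynamics s' = s + 0.1 * a,
# with a noisy expert that steers toward the origin.
rng = np.random.default_rng(0)
demo_states = rng.uniform(-1.0, 1.0, size=(50, 2))
demo_actions = -demo_states + 0.1 * rng.standard_normal((50, 2))
demo_next = demo_states + 0.1 * demo_actions

A, B = fit_linear_dynamics(demo_states, demo_actions, demo_next)
aug_states, aug_actions = generate_corrective_labels(
    demo_states, demo_next, A, B, noise_scale=0.05, rng=rng
)

# Augmented dataset: original demonstrations plus corrective pairs.
train_states = np.vstack([demo_states, aug_states])
train_actions = np.vstack([demo_actions, aug_actions])
```

A behavior-cloning policy would then be trained on `train_states` and `train_actions`. In a realistic setting the dynamics model is nonlinear and only locally trustworthy, which is why the abstract stresses generating labels only within the neighborhood of the demonstrations and in locally continuous regions.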