BridgeData V2: A Dataset for Robot Learning at Scale

Homer Walke, Kevin Black, Abraham Lee, Moo Jin Kim, Max Du, Chongyi Zheng, Tony Zhao, Philippe Hansen-Estruch, Quan Vuong, Andre He, Vivek Myers, Kuan Fang, Chelsea Finn, Sergey Levine
2024-01-18
Abstract: We introduce BridgeData V2, a large and diverse dataset of robotic manipulation behaviors designed to facilitate research on scalable robot learning. BridgeData V2 contains 60,096 trajectories collected across 24 environments on a publicly available low-cost robot. BridgeData V2 provides extensive task and environment variability, leading to skills that can generalize across environments, domains, and institutions, making the dataset a useful resource for a broad range of researchers. Additionally, the dataset is compatible with a wide variety of open-vocabulary, multi-task learning methods conditioned on goal images or natural language instructions. In our experiments, we train 6 state-of-the-art imitation learning and offline reinforcement learning methods on our dataset, and find that they succeed on a suite of tasks requiring varying amounts of generalization. We also demonstrate that the performance of these methods improves with more data and higher capacity models, and that training on a greater variety of skills leads to improved generalization. By publicly sharing BridgeData V2 and our pre-trained models, we aim to accelerate research in scalable robot learning methods. Project page at https://rail-berkeley.github.io/bridgedata
Robotics, Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is how to promote research on scalable robot learning by constructing a large, diverse dataset of robotic manipulation behaviors. Specifically, the paper introduces BridgeData V2, a dataset that aims to support skill generalization across environments, domains, and institutions, and that is compatible with open-vocabulary, multi-task learning methods conditioned on goal images or natural language instructions.

### Main Issues:

1. **Skill Generalization**: How to enable robots to generalize well across different environments, tasks, and institutions.
2. **Data Diversity**: How to construct a large-scale dataset that covers a variety of tasks and environments, so that it can support a broad range of robot learning research.
3. **Multi-task Learning**: How to support flexible task conditioning, guiding robot behavior through goal images or natural language instructions.
4. **Data-driven Robot Learning**: How to use large-scale datasets to train high-capacity models and thereby improve the performance of robot learning.

### Solutions:

- **BridgeData V2 Dataset**: Contains 60,096 trajectories covering 13 skills in 24 different environments, including tasks such as grasping and placing, pushing and pulling, sweeping, and folding.
- **Data Collection Methods**: Data is collected through teleoperation and automated policies to ensure diversity and quality.
- **Experimental Validation**: Six state-of-the-art imitation learning and offline reinforcement learning methods are trained on the dataset and evaluated on tasks that require varying amounts of generalization.

### Main Contributions:

1. **Large-scale Dataset**: Provides a larger and more diverse dataset of robotic manipulation behaviors than existing datasets.
2. **Generalization Capability**: Demonstrates that skills learned from the dataset generalize across different labs and environments.
3. **Multi-task Learning**: Supports various task conditioning methods, including those based on goal images and natural language instructions.
4. **Performance Analysis**: Analyzes the impact of model size, dataset size, and skill diversity on learning performance, showing that performance improves with more data and higher-capacity models and that training on a greater variety of skills improves generalization.

Through these contributions, the paper aims to accelerate research on scalable robot learning methods and provide a valuable resource for the academic community.
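To make the two conditioning modes concrete, the following is a minimal sketch of how a single trajectory could be turned into either a goal-image-conditioned or a language-conditioned training sample. It is an illustration only, not the paper's data loader: the `Trajectory` and `Sample` classes, the field names, and the choice of a randomly selected future frame as the goal are all assumptions made for this example, and image frames are stood in for by strings.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Trajectory:
    """Hypothetical container: one more observation than actions."""
    observations: List[str]            # stand-ins for image frames
    actions: List[Sequence[float]]     # one action per transition
    language: Optional[str] = None     # optional instruction label


@dataclass
class Sample:
    observation: str
    action: Sequence[float]
    goal_image: Optional[str] = None
    instruction: Optional[str] = None


def sample_goal_conditioned(traj: Trajectory, rng: random.Random) -> Sample:
    """Pick a timestep and use a later frame of the same trajectory as the goal."""
    t = rng.randrange(len(traj.actions))
    # Assumed relabeling scheme: any strictly later frame can serve as the goal.
    g = rng.randrange(t + 1, len(traj.observations))
    return Sample(traj.observations[t], traj.actions[t],
                  goal_image=traj.observations[g])


def sample_language_conditioned(traj: Trajectory, rng: random.Random) -> Sample:
    """Pick a timestep and condition on the trajectory's instruction."""
    if traj.language is None:
        raise ValueError("trajectory has no language annotation")
    t = rng.randrange(len(traj.actions))
    return Sample(traj.observations[t], traj.actions[t],
                  instruction=traj.language)


# Toy trajectory with 4 frames and 3 actions.
demo = Trajectory(
    observations=["frame0", "frame1", "frame2", "frame3"],
    actions=[(0.1, 0.0), (0.0, 0.2), (-0.1, 0.0)],
    language="put the spoon in the pot",
)
rng = random.Random(0)
goal_sample = sample_goal_conditioned(demo, rng)
lang_sample = sample_language_conditioned(demo, rng)
```

Both samplers return the same `Sample` type, so under these assumptions a single policy interface could accept either form of task specification, with the unused field left as `None`.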