RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots

Soroush Nasiriany, Abhiram Maddukuri, Lance Zhang, Adeet Parikh, Aaron Lo, Abhishek Joshi, Ajay Mandlekar, Yuke Zhu
2024-06-05
Abstract: Recent advancements in Artificial Intelligence (AI) have largely been propelled by scaling. In Robotics, scaling is hindered by the lack of access to massive robot datasets. We advocate using realistic physical simulation as a means to scale environments, tasks, and datasets for robot learning methods. We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments. RoboCasa features realistic and diverse scenes focusing on kitchen environments. We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances. We enrich the realism and diversity of our simulation with generative AI tools, such as object assets from text-to-3D models and environment textures from text-to-image models. We design a set of 100 tasks for systematic evaluation, including composite tasks generated by the guidance of large language models. To facilitate learning, we provide high-quality human demonstrations and integrate automated trajectory generation methods to substantially enlarge our datasets with minimal human burden. Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning and show great promise in harnessing simulation data in real-world tasks. Videos and open-source code are available at https://robocasa.ai/
Robotics, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses how to obtain robot training data at a scale large enough to improve the learning and generalization of generalist robot models. It presents RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments, with a focus on kitchens. The framework includes:
1. Diverse assets: 120 kitchen scenes and over 2,500 3D objects, with generative AI tools used to enrich the object assets and environment textures.
2. Support for different types of robots, such as mobile manipulators and humanoid robots.
3. A set of 100 tasks for systematic evaluation, including composite tasks generated with guidance from large language models.
4. A large training dataset of over 100,000 trajectories.
The paper counters the scarcity of real-world robot data by generating large amounts of low-cost, realistic data in simulation. To build the dataset, the authors provide high-quality human demonstrations and apply automated trajectory generation methods to expand them with minimal additional human effort. The experiments show a clear scaling trend when synthetically generated data is used for large-scale imitation learning, and they indicate that simulation data holds promise for real-world tasks. Compared with existing simulation frameworks, RoboCasa offers greater realism, diversity, and support for multiple robot types, making it the first framework to combine large-scale tasks, scenes, and AI-generated assets.