NeRF-Aug: Data Augmentation for Robotics with Neural Radiance Fields

Eric Zhu, Mara Levy, Matthew Gwilliam, Abhinav Shrivastava
2024-11-05
Abstract: Training a policy that can generalize to unknown objects is a long-standing challenge within the field of robotics. The performance of a policy often drops significantly in situations where an object in the scene was not seen during training. To solve this problem, we present NeRF-Aug, a novel method that is capable of teaching a policy to interact with objects that are not present in the dataset. This approach differs from existing approaches by leveraging the speed and photorealism of a neural radiance field for augmentation. NeRF-Aug both creates more photorealistic data and runs 3.83 times faster than existing methods. We demonstrate the effectiveness of our method on 4 tasks with 11 novel objects that have no expert demonstration data. We achieve an average 69.1% success rate increase over existing methods. See video results at https://nerf-aug.github.io.
Robotics, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses a problem in robotics: training a policy that generalizes to unseen objects. When an object in the scene did not appear during training, the performance of existing policies usually drops significantly. To overcome this, the paper proposes NeRF-Aug, a data augmentation method based on Neural Radiance Fields (NeRF) that teaches the policy to interact with objects that do not appear in the dataset. According to the abstract, the method generates more photorealistic synthetic data than existing methods and runs 3.83 times faster.

### Main contributions of the paper

1. **Proposing a fast and realistic image-editing framework**: The framework generates synthetic data for robot policy learning so that the policy generalizes to new objects.
2. **Using multi-view images to train the NeRF model of new objects**: The robot arm captures multi-view images of each new object, and a NeRF model is learned from them.
3. **Generating a synthetic dataset through image editing**: The training objects are removed from the existing demonstrations and the new objects are rendered in their place with NeRF, producing a synthetic dataset for training robot policies.
4. **Demonstrating effective generalization on four different tasks**: Training on the generated synthetic data improves the robot's success rate when it handles new objects.

### Method overview

1. **Create the NeRF model of a new object**: Use the camera mounted on the robot arm to collect multi-view images of the new object and train its NeRF model.
2. **Calculate the position of the camera relative to the object**: Use the gripper position of the robot arm to calculate the pose of the camera relative to the object.
3. **NeRF rendering**: For each trajectory frame, query the NeRF model with the camera-to-object pose matrix to render an image of the new object.
4. **Combine NeRF-rendered and original images**: Use a pre-trained segmentation model and an object-erasing tool to remove the original object, then fuse the NeRF-rendered image of the new object with the background image to produce a new synthetic image.
5. **Train and evaluate the policy**: Train the robot policy on the generated synthetic dataset and evaluate it on tasks that include new objects.

### Experimental results

- **Task success rate**: Across four real-world tasks (grasping, pick-and-place, pick-up, put-down), policies trained on synthetic data generated by NeRF-Aug achieve an average success rate increase of 69.1% over existing methods.
- **Data generation speed**: NeRF-Aug generates new data faster than existing methods (3.83 times faster, according to the abstract), and the advantage is most pronounced when generating data for subsequent new objects.

### Conclusion

NeRF-Aug provides an efficient and realistic data augmentation method that can significantly improve the generalization of robot policies to new objects without requiring a large number of additional human demonstrations. Future research can explore other novel view-synthesis methods, such as Gaussian Splatting and Plenoxels, to build similar augmentation frameworks.
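As a rough illustration of steps 2 and 4 of the method overview, the sketch below computes a camera pose in the object's frame from a gripper pose and then blends a rendered object into a background image. It is not the authors' implementation. The function names, the 4x4 homogeneous-transform convention, the fixed gripper-to-camera offset, and the use of simple alpha blending for the fusion step are all assumptions made for illustration; the summary does not specify how the fusion is done, and the segmentation, erasing, and NeRF rendering stages are replaced here by placeholder arrays.

```python
import numpy as np


def translation(x, y, z):
    """Build a 4x4 homogeneous transform with identity rotation (demo helper)."""
    T = np.eye(4)
    T[:3, 3] = [x, y, z]
    return T


def camera_pose_in_object_frame(T_world_gripper, T_gripper_camera, T_world_object):
    """Return the camera pose expressed in the object's frame.

    Assumed inputs (all 4x4 homogeneous transforms, names hypothetical):
      T_world_gripper  - gripper pose, e.g. from the robot's forward kinematics
      T_gripper_camera - fixed mounting offset of the camera on the gripper
      T_world_object   - pose of the object in the world frame
    """
    T_world_camera = T_world_gripper @ T_gripper_camera
    return np.linalg.inv(T_world_object) @ T_world_camera


def composite(background, rendered_rgb, alpha):
    """Alpha-blend a rendered object over a background with the old object erased.

    background, rendered_rgb: float arrays of shape (H, W, 3) with values in [0, 1]
    alpha: float array of shape (H, W) in [0, 1], the render's opacity
           (assumed to be available from the renderer)
    """
    a = alpha[..., None]  # broadcast opacity over the colour channels
    return a * rendered_rgb + (1.0 - a) * background


# Toy example: gripper 0.3 m above the table, camera 0.1 m further along the
# gripper's z axis, object on the table directly below the gripper.
pose = camera_pose_in_object_frame(
    translation(0.5, 0.0, 0.3),
    translation(0.0, 0.0, 0.1),
    translation(0.5, 0.0, 0.0),
)

# Placeholder images standing in for the erased background and the NeRF render.
background = np.zeros((2, 2, 3))
rendered = np.ones((2, 2, 3))
alpha = np.array([[1.0, 0.0], [0.5, 0.0]])
image = composite(background, rendered, alpha)
```

In a real pipeline the pose matrix would be passed to the trained NeRF to render `rendered_rgb` and `alpha` for each trajectory frame, and `background` would come from the segmentation and erasing step.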