Close the Sim2real Gap via Physically-based Structured Light Synthetic Data Simulation

Kaixin Bai, Lei Zhang, Zhaopeng Chen, Fang Wan, Jianwei Zhang
2024-07-17
Abstract: Despite the substantial progress in deep learning, its adoption in industrial robotics projects remains limited, primarily due to challenges in data acquisition and labeling. Previous sim2real approaches using domain randomization require extensive scene and model optimization. To address these issues, we introduce an innovative physically-based structured light simulation system, generating both RGB and physically realistic depth images, surpassing previous dataset generation tools. We create an RGBD dataset tailored for robotic industrial grasping scenarios and evaluate it across various tasks, including object detection, instance segmentation, and embedding sim2real visual perception in industrial robotic grasping. By reducing the sim2real gap and enhancing deep learning training, we facilitate the application of deep learning models in industrial settings. Project details are available at https://baikaixinpublic.github.io/structured light 3D synthesizer/.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

The paper aims to address the challenges faced when adopting deep learning techniques in industrial robotics projects, particularly the difficulties in data collection and annotation. Specifically, the paper proposes a physics-based structured light synthetic data generation system to produce realistic RGB images and physically accurate depth images, in order to reduce the sim2real gap. By doing so, the researchers hope to improve the application effectiveness of deep learning models in industrial scenarios.

### Specific Problem Description

1. **Data Collection Difficulty**: In computer vision and robotic vision tasks, especially for object segmentation and 6D pose annotation, data collection is very time-consuming and challenging.
2. **Factory Data Acquisition Restrictions**: Due to factory regulations, confidentiality, and security considerations, collecting industrial data for training deep learning models faces numerous obstacles.
3. **Insufficiencies of Existing sim2real Methods**: Existing sim2real methods, such as domain randomization, although capable of generating realistic RGB images, require extensive scene and model optimization work and demand expert-level rendering knowledge.

### Solution

To address the above issues, the authors propose a new physics-based Gray code structured light camera simulation data generation tool, which includes:

- Using the Blender Cycles rendering engine and OptiX AI denoiser to generate realistic RGB data and physically accurate depth data.
- Creating a training set with physically accurate RGBD data and a real data test set to evaluate the sim2real performance gap and the generalization ability of visual perception tasks (such as object detection and instance segmentation).
- Demonstrating the effectiveness of this data generation method in actual robotic tasks.

Through these methods, the paper aims to reduce the sim2real gap and improve the performance of deep learning models in industrial environments.
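
To make the Gray code structured light idea mentioned above more concrete, the following is a minimal sketch of how binary-reflected Gray code stripe patterns can be generated for a projector and decoded back into projector column indices. It is not the authors' implementation: the function names (`to_gray`, `from_gray`, `gray_code_patterns`, `decode_columns`) and the tiny 8-column projector are assumptions chosen for illustration, and a real pipeline would render each pattern onto the scene (for example in Blender Cycles), threshold the captured images, and triangulate depth from the decoded correspondences.

```python
def to_gray(n):
    """Convert a binary index to its binary-reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g):
    """Convert a Gray code value back to the binary index."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def gray_code_patterns(width):
    """Return one stripe pattern per bit, most significant bit first.

    Each pattern is a list of 0/1 values, one per projector column
    (hypothetical projector of `width` columns).
    """
    n_bits = max(1, (width - 1).bit_length())
    patterns = []
    for bit in range(n_bits - 1, -1, -1):
        patterns.append([(to_gray(x) >> bit) & 1 for x in range(width)])
    return patterns


def decode_columns(observed):
    """Recover the projector column seen at each pixel.

    `observed` holds the thresholded 0/1 value of every pixel under each
    projected pattern, in the same order the patterns were projected.
    """
    n_pixels = len(observed[0])
    decoded = []
    for x in range(n_pixels):
        g = 0
        for pattern in observed:
            g = (g << 1) | pattern[x]
        decoded.append(from_gray(g))
    return decoded


if __name__ == "__main__":
    patterns = gray_code_patterns(8)
    for p in patterns:
        print("".join(str(v) for v in p))
    print(decode_columns(patterns))
```

Gray codes are commonly preferred over plain binary stripes because adjacent columns differ in exactly one bit, so a thresholding error at a stripe boundary shifts the decoded column by at most one position.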