Data-efficient 3D instance segmentation by transferring knowledge from synthetic scans

Xiaodong Wu, Ruiping Wang, Xilin Chen
DOI: https://doi.org/10.1016/j.patrec.2024.02.001
IF: 4.757
2024-03-01
Pattern Recognition Letters
Abstract: The ability to understand indoor environments in 3D is critical for robots. While deep learning-based methods have improved performance, they require significant amounts of annotated training data. However, scanning and annotating point cloud data in real scenes is costly, which leads to data scarcity. Consequently, there is an urgent need to investigate data-efficient methods for point cloud instance segmentation. To tackle this issue, we propose to leverage the geometric and scene context knowledge inherent in synthetic data to reduce the need for annotation on real data. Specifically, we simulate the process by which humans scan and collect point cloud data in real-world scenes, and we construct three large-scale synthetic point cloud datasets from synthetic scenes. Each of these three datasets is more than ten times the scale of currently available real-world data. Experimental results demonstrate that incorporating synthetic point cloud data can increase instance segmentation performance by over 18.8 percentage points. Further, to address the domain shift between synthetic and real data, we propose a target-aware pre-training method. It integrates both real and synthetic data during pre-training, allowing the model to learn a feature representation that generalizes effectively to downstream real data. Experiments show that our method achieves stable improvements on all three synthetic datasets. The data and code will be publicly available in the future.
computer science, artificial intelligence