InstanceFusion: Real‐time Instance‐level 3D Reconstruction Using a Single RGBD Camera

Feixiang Lu, Haotian Peng, Hongyu Wu, Jun Yang, Xinhang Yang, Ruizhi Cao, Liangjun Zhang, Ruigang Yang, Bin Zhou
DOI: https://doi.org/10.1111/cgf.14157
IF: 2.5
2020-10-01
Computer Graphics Forum
Abstract: We present InstanceFusion, a robust real‐time system to detect, segment, and reconstruct instance‐level 3D objects of indoor scenes with a hand‐held RGBD camera. It combines the strengths of deep learning and traditional SLAM techniques to produce visually compelling 3D semantic models. The key to its success is our novel segmentation scheme and the efficient instance‐level data fusion, both of which are implemented on the GPU. Specifically, for each incoming RGBD frame, we take advantage of the RGBD features, the 3D point cloud, and the reconstructed model to perform instance‐level segmentation. The corresponding RGBD data, along with the instance ID, are then fused into the surfel‐based models. To store and update these data efficiently, we design and implement a new data structure using the OpenGL Shading Language. Experimental results show that our method outperforms state‐of‐the‐art (SOTA) methods in instance segmentation and data fusion by a large margin. In addition, our instance segmentation improves the precision of 3D reconstruction, especially at loop closure. The InstanceFusion system runs at 20.5 Hz on a consumer‐level GPU, which supports a number of augmented reality (AR) applications (e.g., 3D model registration, virtual interaction, AR map) and robot applications (e.g., navigation, manipulation, grasping). To facilitate future research and make our system easier to reproduce, the source code, data, and the trained model are released on GitHub: https://github.com/Fancomi2017/InstanceFusion.
computer science, software engineering
What problem does this paper attempt to address?
The paper addresses real-time, instance-level detection, segmentation, and reconstruction of 3D objects in indoor scenes using a single RGBD camera. The authors propose a system named InstanceFusion, which targets several limitations of existing instance-level 3D reconstruction methods:

1. **Instance-level segmentation accuracy**: Traditional 3D reconstruction techniques often provide only geometric models and lack object-level semantic information. InstanceFusion achieves accurate instance-level segmentation of 3D objects by combining deep learning with traditional SLAM techniques.
2. **Real-time performance**: Existing methods either rely on post-processing and therefore give no immediate feedback, or run online but cannot produce high-quality 3D models because their 2D segmentation results are inaccurate. InstanceFusion runs at 20.5 Hz on a consumer-grade GPU through an optimized two-stage segmentation algorithm and an efficient instance-level data fusion process.
3. **Robustness and automation**: The system automatically carries out the entire pipeline, from the raw RGBD stream to the incrementally fused instance-level surface model, without any prior scene knowledge or predefined template models.
4. **Broader application scenarios**: InstanceFusion supports augmented reality (AR) applications such as 3D model registration, virtual interaction, and AR maps, as well as robot tasks such as navigation, manipulation, and grasping.

In summary, InstanceFusion offers an efficient, accurate, real-time approach to instance-level 3D reconstruction of indoor scenes, providing a technical foundation for AR and robot applications.
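
The abstract states that each frame's RGBD data and instance ID are fused into surfel-based models, but it does not give the update rule. The following is a minimal CPU-side sketch of one plausible scheme, not the authors' GLSL implementation: it assumes a running weighted mean for the surfel position and a weighted vote over observed instance IDs. The `Surfel` class, its fields, and the `fuse` method are hypothetical names introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Surfel:
    """Hypothetical surfel record: a 3D position plus per-instance votes."""
    position: List[float]                 # 3D point in world coordinates
    confidence: float = 0.0               # accumulated fusion weight
    votes: Counter = field(default_factory=Counter)  # instance ID -> weight

    def fuse(self, point: List[float], instance_id: int, weight: float = 1.0) -> None:
        """Blend one observation into the surfel (assumed update rule)."""
        total = self.confidence + weight
        # Running weighted mean of the position.
        self.position = [
            (self.confidence * old + weight * new) / total
            for old, new in zip(self.position, point)
        ]
        self.confidence = total
        # Accumulate evidence for the instance ID seen in this frame.
        self.votes[instance_id] += weight

    @property
    def instance_id(self) -> Optional[int]:
        """Current label: the instance ID with the largest accumulated weight."""
        return self.votes.most_common(1)[0][0] if self.votes else None


# Usage: three observations of the same surface point, one mislabeled.
surfel = Surfel(position=[0.0, 0.0, 0.0])
surfel.fuse([1.0, 0.0, 0.0], instance_id=7)
surfel.fuse([3.0, 0.0, 0.0], instance_id=7)
surfel.fuse([2.0, 0.0, 0.0], instance_id=4)
print(surfel.position, surfel.instance_id)
```

Keeping a vote per instance ID, rather than overwriting the label on every frame, lets a single noisy 2D segmentation result be outvoted by consistent earlier observations. A GPU implementation would need a fixed-size layout instead of a dictionary.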