Semantic Scene Completion Through Multi-Level Feature Fusion

Ruochong Fu, Hang Wu, Mengxiang Hao, Yubin Miao
DOI: https://doi.org/10.1109/iros47612.2022.9981517
2022-01-01
Abstract: A partial observation of an indoor scene (a single-view RGB-D image) carries insufficient spatial information for complex tasks such as autonomous navigation and virtual reality, so many learning-based methods have been proposed to perform semantic scene completion (SSC) from single-view input. However, most of them extract only scene-level features from the input to generate the output, which can lose detail. In this paper, we propose a new method that fully utilizes both instance-level and scene-level features. First, an object detection module is pre-trained to localize indoor objects. Second, a coarse completion result is obtained from scene-level features using an encoder-decoder structure. Finally, based on the bounding boxes predicted by the pre-trained detector, the coarse completion result is refined using a geometric refinement module. The network's performance is evaluated on both real and synthetic datasets. The results demonstrate that our network reconstructs indoor scenes with more geometric detail, produces clearer boundaries between instances, and outperforms most existing SSC methods both qualitatively and quantitatively.
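The data flow described in the abstract (coarse scene-level completion followed by box-guided refinement) can be illustrated with a minimal sketch. This is not the paper's network: the function names `coarse_completion` and `refine_with_boxes`, the box format `(z0, y0, x0, z1, y1, x1)` in voxel indices, the convention that label 0 means empty space, and the refinement rule itself (relabel occupied voxels inside a box with the detected class) are all assumptions chosen only to show how detected boxes could constrain a coarse voxel grid.

```python
import numpy as np

EMPTY = 0  # assumption: label 0 denotes free space


def coarse_completion(scene_scores):
    """Stand-in for the encoder-decoder stage.

    scene_scores: array of shape (num_classes, D, H, W) holding
    hypothetical per-voxel class scores. Returns a (D, H, W) label grid.
    """
    return np.argmax(scene_scores, axis=0)


def refine_with_boxes(coarse_labels, boxes):
    """Placeholder for the geometric refinement stage.

    boxes: list of ((z0, y0, x0, z1, y1, x1), class_id) pairs, with
    half-open voxel index ranges. Inside each box, occupied voxels are
    relabelled with the detected class; empty voxels are left alone.
    """
    refined = coarse_labels.copy()
    for (z0, y0, x0, z1, y1, x1), class_id in boxes:
        region = refined[z0:z1, y0:y1, x0:x1]  # a view into `refined`
        region[region != EMPTY] = class_id
    return refined
```

In the method itself, both stages are learned modules; the sketch only mirrors the order of operations, in which instance-level information (the boxes) is applied after a scene-level prediction already exists.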