Semantic Reconstruction based on RGB Image and Sparse Depth

Yu Cai, Yinzhang Ding, Dongxiao Li, Ming Zhang
DOI: https://doi.org/10.1117/12.2589405
2021-01-01
Abstract: With the increasing popularity of applications such as unmanned driving, environment perception has become increasingly important, and its most common form is semantic reconstruction. Consequently, more and more researchers are trying to combine information from multiple sensors to achieve better semantic reconstruction. However, most current estimation methods (a) are too bulky to run in real time, (b) fail to make effective use of information from different sensors, or (c) fail to generate sufficient environment perception information, such as semantic and depth information, under limited computing power. This paper therefore proposes a multi-modal joint estimation network for semantic reconstruction that addresses these problems. Our method takes an RGB image and sparse depth as input. By adding multi-scale information to the neural network, it outputs semantic segmentation and depth recovery results simultaneously while remaining lightweight and real-time, and then fuses both results into point clouds to achieve better environment perception. Extensive experiments show that our method performs better than other methods in the same application scenario.
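The fusion step mentioned in the abstract, combining a dense depth map and a semantic segmentation map into a labeled point cloud, can be illustrated with a minimal sketch. The abstract does not describe how the fusion is done, so this example assumes a standard pinhole camera model; the function name `fuse_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration, not taken from the paper.

```python
from typing import List, Sequence, Tuple

# One fused point: (x, y, z) in the camera frame plus a semantic class id.
Point = Tuple[float, float, float, int]


def fuse_to_point_cloud(
    depth: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project each pixel with valid depth and attach its class label.

    Assumption: a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); pixels with depth <= 0 are treated as invalid and skipped.
    """
    if len(depth) != len(labels):
        raise ValueError("depth and labels must have the same number of rows")

    points: List[Point] = []
    for v, (depth_row, label_row) in enumerate(zip(depth, labels)):
        if len(depth_row) != len(label_row):
            raise ValueError("depth and labels rows must have the same length")
        for u, (z, label) in enumerate(zip(depth_row, label_row)):
            if z <= 0:
                continue  # no depth estimate for this pixel
            x = (u - cx) * z / fx  # horizontal offset scaled by depth
            y = (v - cy) * z / fy  # vertical offset scaled by depth
            points.append((x, y, float(z), label))
    return points


# Toy 2x2 example: one pixel has no valid depth and is dropped.
toy_depth = [[2.0, 0.0], [4.0, 2.0]]
toy_labels = [[1, 1], [0, 2]]
toy_cloud = fuse_to_point_cloud(toy_depth, toy_labels, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In this sketch each output point carries both geometry and semantics, which is the kind of combined representation the abstract refers to; the network that produces the depth and label maps is outside the scope of the example.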