Large-Scale 3D Semantic Mapping Using Monocular Vision

Cheng Zhang, Zhi Liu, Guangwen Liu, Dandan Huang
DOI: https://doi.org/10.1109/icivc47709.2019.8981035
2019-01-01
Abstract: In recent years, with the rapid development of artificial intelligence, simultaneous localization and mapping (SLAM), a technology for solving robot perception problems, has been widely used in autonomous driving, robot navigation, augmented reality, and other fields. This paper proposes a method for reconstructing large-scale, dense 3D semantic maps of outdoor scenes from monocular vision. First, the camera trajectory is estimated by visual odometry, and depth data is obtained with a monocular depth estimation algorithm. Meanwhile, a dense conditional random field (CRF) image segmentation system based on deep learning and super-pixel distribution performs semantic segmentation. The 2D image semantics are then progressively transferred to the 3D point cloud by a Bayesian progressive label transfer strategy, and the 3D labels are further optimized by a novel CRF model. Super-pixels are used to enforce smoothness and to form robust P^N higher-order potentials. Finally, a dense 3D semantic map of the urban environment is generated from an image sequence of arbitrary length. The algorithm is tested on the KITTI dataset, and the constructed semantic maps show that it can reconstruct a globally consistent semantic map in large-scale outdoor scenes.
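The abstract does not spell out the Bayesian label transfer step, so the following is only a minimal sketch of one common formulation, a recursive Bayesian update of a per-point class distribution, and not the authors' implementation. The function names (`fuse_label`, `fuse_sequence`, `most_likely`), the uniform prior, and the assumption that each frame supplies an independent per-class probability vector for the pixel a 3D point projects to are all illustrative assumptions.

```python
from typing import List, Sequence


def fuse_label(prior: Sequence[float], likelihood: Sequence[float],
               eps: float = 1e-9) -> List[float]:
    """One recursive Bayesian update: posterior is proportional to likelihood * prior.

    `prior` is the current class distribution of a 3D point; `likelihood` is the
    2D segmentation probability vector observed for that point in a new frame.
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must have the same length")
    # Clamp the likelihood so one confident frame cannot zero out a class forever.
    unnormalized = [max(l, eps) * p for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def fuse_sequence(num_classes: int,
                  observations: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-frame observations of one 3D point, starting from a uniform prior."""
    belief = [1.0 / num_classes] * num_classes
    for observation in observations:
        belief = fuse_label(belief, observation)
    return belief


def most_likely(belief: Sequence[float]) -> int:
    """Index of the class with the highest posterior probability."""
    return max(range(len(belief)), key=lambda i: belief[i])


if __name__ == "__main__":
    # Hypothetical class probabilities for one point seen in three frames.
    frames = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]]
    belief = fuse_sequence(3, frames)
    print(most_likely(belief), belief)
```

Because each new frame only multiplies the stored distribution and renormalizes, the update runs incrementally over a sequence of any length. In the pipeline the abstract describes, the fused labels would then be refined further by the 3D CRF, which this sketch does not cover.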