A 3D Semantic Visual SLAM in Dynamic Scenes

Shanshan Hu, Dan Li, Gujie Tang, Xiangrong Xu
DOI: https://doi.org/10.1109/icarm52023.2021.9536177
2021-01-01
Abstract: Simultaneous Localization and Mapping (SLAM) is a core function of mobile robots. At present, mainstream visual SLAM technology is mostly based on the static-scene assumption and therefore captures only geometric information. In dynamic scenes in particular, it is susceptible to moving targets, which cause mismatches in pose estimation and result in poor system robustness and positioning accuracy. In this paper, a visual semantic map construction framework for dynamic scenes, named Dynamic Visual Semantic SLAM (DVS_SLAM), is proposed on top of the Oriented FAST and Rotated BRIEF SLAM2 (ORB_SLAM2) framework; it can run on an ordinary PC with high real-time performance and localization accuracy. First, an improved multi-view geometry method and a region growing algorithm are used in the tracking thread to detect moving targets, so that dynamic feature points can be removed to improve localization accuracy. Second, the lightweight SSD-MobileNetV2 deep learning network is added to obtain 2D information, and fusion based on a color bumpy supervoxel clustering algorithm is used to extract 3D target information. Finally, a new 3D map construction thread is added to build a 3D semantic map suitable for navigation using OctoMap. Experiments were carried out on the TUM RGB-D dataset and a self-built robot platform. The results show that DVS_SLAM performs much better than ORB_SLAM2 in dynamic scenes. Meanwhile, a dense semantic octree map is established to improve the environment perception ability of mobile robots.