Semantic Visual Simultaneous Localization and Mapping: A Survey

Kaiqi Chen, Jianhua Zhang, Jialing Liu, Qiyi Tong, Ruyu Liu, Shengyong Chen
DOI: https://doi.org/10.48550/arXiv.2209.06428
2022-09-14
Abstract: Visual Simultaneous Localization and Mapping (vSLAM) has achieved great progress in the computer vision and robotics communities, and has been successfully used in many fields such as autonomous robot navigation and AR/VR. However, vSLAM cannot achieve good localization in dynamic and complex environments. In recent years, numerous publications have reported that combining semantic information with vSLAM gives semantic vSLAM systems the capability to solve these problems. Nevertheless, there is no comprehensive survey of semantic vSLAM. To fill the gap, this paper first reviews the development of semantic vSLAM, explicitly focusing on its strengths and differences. Second, we explore three main issues of semantic vSLAM: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM. Then, we collect and analyze the current state-of-the-art SLAM datasets that have been widely used in semantic vSLAM systems. Finally, we discuss future directions that will provide a blueprint for the future development of semantic vSLAM.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses the difficulty visual Simultaneous Localization and Mapping (vSLAM) has in localizing well in dynamic and complex environments. Traditional vSLAM methods struggle with illumination changes, moving objects, and texture-poor scenes, and the purely geometric maps they build are hard to use for path planning and navigation. To address these problems, researchers have begun combining semantic information with vSLAM, an approach known as semantic vSLAM. By using deep-learning techniques to extract feature points, descriptors, and semantic information and to estimate pose, semantic vSLAM recovers not only the geometric structure of the environment but also semantic information such as the position, orientation, and category of individual objects. This improves the accuracy and robustness of localization and makes it possible to build different types of semantic maps, such as pixel-level maps and object-level maps. As a result, semantic vSLAM can help robots perceive and adapt to unknown, complex environments more accurately and carry out more complex tasks.
The paper's main contributions are a systematic review of the development of semantic vSLAM, which the authors state had not previously been surveyed comprehensively, with a focus on its strengths and distinguishing characteristics, and a discussion of three main issues: the extraction and association of semantic information, the application of semantic information, and the advantages of semantic vSLAM. The paper also collects and analyzes the SLAM datasets widely used in semantic vSLAM systems and discusses future directions, which the authors present as a blueprint for the field's further development.
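As an illustration of how semantic information can help with moving objects, the following minimal sketch discards feature points that fall on pixels labeled as a potentially dynamic class before they would be used for pose estimation. It is not taken from the paper: the function name `filter_static_keypoints`, the label ids in `DYNAMIC_LABELS`, and the toy mask are all assumptions made for the example, and a real system would obtain the mask from a segmentation network and the keypoints from a feature detector.

```python
from typing import List, Sequence, Set, Tuple

Keypoint = Tuple[int, int]  # (x, y) pixel coordinates: column, row

# Hypothetical label ids for classes treated as potentially moving
# (illustrative only, e.g. 1 = person, 2 = car).
DYNAMIC_LABELS: Set[int] = {1, 2}


def filter_static_keypoints(
    keypoints: Sequence[Keypoint],
    label_mask: Sequence[Sequence[int]],
    dynamic_labels: Set[int] = DYNAMIC_LABELS,
) -> List[Keypoint]:
    """Keep only keypoints whose pixel is not labeled as a dynamic class."""
    height = len(label_mask)
    width = len(label_mask[0]) if height else 0
    static: List[Keypoint] = []
    for x, y in keypoints:
        # Skip keypoints that fall outside the label mask.
        if not (0 <= x < width and 0 <= y < height):
            continue
        # The mask is indexed row first, then column.
        if label_mask[y][x] not in dynamic_labels:
            static.append((x, y))
    return static


# Toy 3x4 per-pixel label mask: 0 = background, 1 = person.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
points = [(0, 0), (2, 0), (3, 1), (1, 2), (3, 2)]
static_points = filter_static_keypoints(points, mask)
print(static_points)  # [(0, 0), (1, 2), (3, 2)]
```

Only the points lying on background pixels survive. A downstream pose estimator would then work from features that are more likely to belong to the static scene, which is one plausible reading of how semantic labels can improve robustness in dynamic environments.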