Abstract:
Purpose Most traditional visual simultaneous localization and mapping (V-SLAM) algorithms assume that most objects in the environment are static or moving slowly. These algorithms rely on geometric information about the environment, which restricts their use in scenes with dynamic objects. Semantic segmentation can extract deep features from images to identify dynamic objects in the real world. Therefore, V-SLAM fused with semantic information can reduce the influence of dynamic objects and achieve higher accuracy. This paper presents a new semantic stereo V-SLAM method for outdoor dynamic environments, aimed at more accurate pose estimation.
Design/methodology/approach First, the DeepLabv3+ semantic segmentation model is adopted to recognize semantic information about dynamic objects in outdoor scenes. Second, an approach is proposed that combines prior knowledge to determine the dynamic hierarchy of movable objects, which depends on the pixel movement between frames. Finally, a semantic stereo V-SLAM system based on ORB-SLAM2 is presented to calculate accurate trajectories in dynamic environments; it selects corresponding feature points in static regions and eliminates useless feature points in dynamic regions.
Findings The proposed method is verified on the public KITTI data set and on a self-collected ZED2 data set from the real world. The proposed V-SLAM system can extract semantic information and track feature points steadily in dynamic environments. Absolute pose error and relative pose error are used to evaluate the proposed method. Experimental results show significant improvements in root mean square error and standard deviation on both the KITTI data set and an unmanned aerial vehicle. This indicates that the method can be effectively applied to outdoor environments.
Originality/value The main contribution of this study is a new semantic stereo V-SLAM method with greater robustness and stability, which reduces the impact of moving objects in dynamic scenes.
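The feature selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a per-pixel label map (such as a segmentation model might produce), a per-pixel motion magnitude between frames, and a hypothetical rule: a feature point is discarded only if it lies on a movable class and its pixel movement exceeds a threshold. The class id, the threshold and the function name `select_static_keypoints` are all assumptions made for illustration.

```python
import numpy as np

# Assumed label id for a movable class; real ids depend on the
# segmentation model and its training data set.
CAR_ID = 13


def select_static_keypoints(keypoints, labels, motion_px, movable_ids, threshold_px):
    """Keep feature points that are not on moving objects.

    keypoints : (N, 2) float array of (x, y) pixel coordinates
    labels    : (H, W) int array of semantic class ids
    motion_px : (H, W) float array of pixel movement between frames
    Returns the kept keypoints and the boolean keep mask.
    """
    height, width = labels.shape
    # Round to the nearest pixel and clamp to the image bounds.
    xs = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, width - 1)
    ys = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, height - 1)

    # Prior knowledge: which classes are able to move at all.
    on_movable = np.isin(labels[ys, xs], list(movable_ids))
    # Observation: whether the pixel actually moved between frames.
    is_moving = motion_px[ys, xs] > threshold_px

    # Hypothetical rule: drop a point only if both conditions hold,
    # so points on parked cars are still available for tracking.
    keep = ~(on_movable & is_moving)
    return keypoints[keep], keep


# Toy 4x6 frame: the right half is labelled as a car,
# and only the last column is moving.
labels = np.zeros((4, 6), dtype=int)
labels[:, 3:] = CAR_ID
motion_px = np.zeros((4, 6))
motion_px[:, 5] = 5.0

keypoints = np.array([[1.0, 1.0],   # background
                      [3.0, 2.0],   # car, not moving
                      [5.0, 2.0]])  # car, moving

static_points, keep_mask = select_static_keypoints(
    keypoints, labels, motion_px, {CAR_ID}, threshold_px=2.0
)
```

In this toy frame only the third point is removed. Combining a class prior with observed motion, rather than masking every movable class outright, keeps more static features available in scenes dominated by parked vehicles.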

Monocular Semantic SLAM using Object-pose-graph Constraints

From Satellite to Ground: Satellite Assisted Visual Localization with Cross-view Semantic Matching

Object SLAM Based on Spatial Layout and Semantic Consistency

Object-aware Semantic Mapping of Indoor Scenes Using Octomap

Monocular Object and Plane SLAM in Structured Environments

Monocular SLAM for Large Scale Scenes

Robust Monocular SLAM in Dynamic Environments

Semi-Dense 3D Semantic Mapping from Monocular SLAM

SemanticSLAM: Learning based Semantic Map Construction and Robust Camera Localization

A Monocular SLAM System with Mask Loop Closing

Utilization of Semantic Planes: Improved Localization and Dense Semantic Map for Monocular SLAM in Urban Environment

Hybrid Semi-Dense 3D Semantic-Topological Mapping From Stereo Visual-Inertial Odometry SLAM With Loop Closure Detection

A semantic visual SLAM towards object selection and tracking optimization

Semantic visual simultaneous localization and mapping (SLAM) using deep learning for dynamic scenes

Semantic stereo visual SLAM toward outdoor dynamic environments based on ORB-SLAM2

Robust Data Association for Object-level Semantic SLAM

SQ-SLAM: Monocular Semantic SLAM Based on Superquadric Object Representation

A Robust Deep Learning Enhanced Monocular SLAM System for Dynamic Environments

Semi-direct Monocular Visual and Visual-Inertial SLAM with Loop Closure Detection

A real-time semantic visual SLAM approach with points and objects

Semantic SLAM for mobile Robots in dynamic environments Based on visual camera sensors