Abstract: Simultaneous localization and mapping (SLAM) is a fundamental problem in robotics and computer vision. A robot or autonomous system must navigate an unknown environment while building a map of its surroundings and estimating its own position within that map. Despite significant progress in SLAM over the years, challenges remain. One prominent issue is robustness and accuracy in dynamic environments, where moving objects introduce uncertainty and error into the estimation process. Traditional methods that use temporal information to distinguish static from dynamic objects are limited in accuracy and applicability. Recent research has therefore leaned towards deep learning-based methods, which handle dynamic objects, semantic segmentation, and motion estimation, with the aim of improving accuracy and adaptability in complex scenes. This article proposes an approach to enhance the robustness and precision of monocular visual odometry in dynamic environments. An enhanced algorithm based on the semantic segmentation network DeepLabV3+ identifies dynamic objects in the image, and a motion consistency check then removes the feature points belonging to those objects. The remaining static feature points are used for feature matching and pose estimation based on ORB-SLAM2, using the Technical University of Munich (TUM) dataset. Experimental results show that, by eliminating the influence of moving objects, our method outperforms traditional visual odometry methods in accuracy and robustness, especially in dynamic environments. Compared with the traditional ORB-SLAM2, the system significantly reduces the absolute trajectory error and the relative pose error in dynamic scenes, improving the accuracy and robustness of the SLAM system's pose estimation.
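The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a boolean segmentation mask is already available (for example from DeepLabV3+), that the motion consistency check is an epipolar-distance test against a known fundamental matrix `F`, and that a match is discarded if either cue flags it. The function names, the `max_dist` threshold, and the way the two cues are combined are all hypothetical.

```python
import numpy as np


def epipolar_distance(F, pts1, pts2):
    """Pixel distance from each point in pts2 to the epipolar line F @ x1."""
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])            # homogeneous points, frame 1
    x2 = np.hstack([pts2, ones])            # homogeneous points, frame 2
    lines = x1 @ F.T                        # row i is the line (a, b, c) = F @ x1_i
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / np.maximum(den, 1e-12)     # guard against degenerate lines


def static_match_mask(pts1, pts2, dynamic_mask, F, max_dist=1.0):
    """Return a boolean array marking matches treated as static.

    Assumption for illustration: a match is kept only if its frame-2 point
    lies outside the segmented dynamic region AND it satisfies the
    epipolar constraint within max_dist pixels.
    """
    mask = np.asarray(dynamic_mask).astype(bool)
    cols = np.clip(np.round(pts2[:, 0]).astype(int), 0, mask.shape[1] - 1)
    rows = np.clip(np.round(pts2[:, 1]).astype(int), 0, mask.shape[0] - 1)
    in_dynamic_region = mask[rows, cols]
    consistent = epipolar_distance(F, pts1, pts2) <= max_dist
    return ~in_dynamic_region & consistent
```

In a full pipeline, the matches that survive this mask would be the ones passed on to pose estimation; `F` would typically be estimated robustly (for instance with RANSAC) rather than given.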

Semantic visual simultaneous localization and mapping (SLAM) using deep learning for dynamic scenes