Abstract: With the spread of low-cost consumer-grade robots, visual SLAM has become widely used for robot localization and navigation in indoor and outdoor environments. However, most current visual SLAM systems estimate pose with poor accuracy and robustness, because under illumination changes and in low-texture environments they are prone to feature extraction failures, mismatches, and tracking loss. To reduce the robot's localization error and improve its operational stability, this paper proposes a novel stereo vision SLAM system based on deep neural networks. First, because handcrafted visual features suffer from unstable extraction and inaccurate matching under illumination changes and low texture, we design a visual feature extraction module based on a feature extraction network. The module contains a differentiable keypoint detection module and is trained with reprojection and dispersed peak losses to obtain accurate and repeatable keypoints. Second, because the bag-of-words algorithm measures feature similarity with the Hamming distance, it cannot support querying and matching the visual features produced by deep neural networks. We modify the algorithm and generate a new visual feature dictionary from features extracted by the deep neural network to ensure correct feature query and matching. Finally, to further optimize the pose estimate of the SLAM system, we build on the original visual odometry and use the redundant pose information from LiDAR to complementarily fuse the pose estimates of the visual front end and the laser front end. We evaluate the algorithm on the KITTI dataset and on a real mobile robot platform. Experimental results show that, compared with other methods, the proposed approach effectively improves the accuracy and robustness of the SLAM system, which demonstrates its practicality and effectiveness.
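The abstract's point about the Hamming distance can be illustrated with a minimal sketch. Hamming distance counts differing bits, so it is only meaningful for binary descriptors, whereas descriptors produced by a neural network are typically real-valued vectors that need a metric such as the Euclidean (L2) distance. The abstract does not state which measure the modified dictionary uses, so the L2 choice here is an assumption, and the function names and toy descriptors are hypothetical.

```python
import math


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have equal length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def l2_distance(u, v) -> float:
    """Euclidean distance between two equal-length real-valued descriptors.

    Assumption: L2 is used only as an illustrative metric for float
    descriptors; the paper's actual similarity measure is not given
    in the abstract.
    """
    if len(u) != len(v):
        raise ValueError("descriptors must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Toy binary descriptors (2 bytes each): one bit differs in each byte.
bin_a = bytes([0b10110010, 0b00001111])
bin_b = bytes([0b10110000, 0b00001011])

# Toy real-valued descriptors, as a network might output.
net_a = [0.6, 0.8, 0.0]
net_b = [0.0, 0.8, 0.6]

if __name__ == "__main__":
    print("Hamming distance:", hamming_distance(bin_a, bin_b))
    print("L2 distance:", l2_distance(net_a, net_b))
```

A bit-counting measure has no natural definition for `net_a` and `net_b`, which is why a vocabulary built around it has to be rebuilt before it can index learned features.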

Stereo Vision SLAM Based on Feature Extraction Network

A robust stereo feature-aided semi-direct SLAM system

Stereo Vision Based SLAM Using Rao-Blackwellised Particle Filter

Stereo Vision Based SLAM in Outdoor Environments

DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features

A deep-learning real-time visual SLAM system based on multi-task feature extraction network and self-supervised feature points

A real-time, robust and versatile visual-SLAM framework based on deep learning networks

A Robust and Efficient SLAM System in Dynamic Environment Based on Deep Features

StereoNeuroBayesSLAM: A Neurobiologically Inspired Stereo Visual SLAM System Based on Direct Sparse Method

A Robust Visual SLAM System in Dynamic Environment

A Monocular Visual SLAM System Augmented by Lightweight Deep Local Feature Extractor Using In-House and Low-Cost LIDAR-camera Integrated Device

DyStSLAM: An Efficient Stereo Vision SLAM System in Dynamic Environment

Light-SLAM: A Robust Deep-Learning Visual SLAM System Based on LightGlue under Challenging Lighting Conditions

A Semi-Dense Feature-based VSLAM System

Robust Stereo Visual SLAM for Dynamic Environments With Moving Object

A Real-time Stereo Visual-Inertial SLAM System Based on Point-and-Line Features

A Real-Time VSLAM Based on Deep Features and Object Detection for Dynamic Environments

A Robust Deep Learning Enhanced Monocular SLAM System for Dynamic Environments

LIFT-SLAM: A deep-learning feature-based monocular visual SLAM method

DVI-SLAM: A Dual Visual Inertial SLAM Network

BASL-AD SLAM: A Robust Deep-Learning Feature-Based Visual SLAM System With Adaptive Motion Model