Abstract: With the popularization of consumer-grade low-cost robots, visual SLAM has been widely used for robot localization and navigation in indoor and outdoor environments. However, most current visual SLAM systems estimate pose with poor accuracy and robustness, because under illumination changes and in low-texture environments they are vulnerable to unreliable feature extraction, mismatching, and tracking loss. To reduce the robot's localization error and improve its operational stability, this paper proposes a novel stereo vision SLAM system based on deep neural networks. First, because handcrafted visual features suffer from unstable extraction and inaccurate matching under illumination changes and low texture, we design a visual feature extraction module based on a feature extraction network. The module contains a differentiable keypoint detection module and is trained with reprojection and dispersed peak losses to obtain accurate and repeatable keypoints. Second, because the bag-of-words algorithm measures feature similarity with the Hamming distance, it cannot support the query and matching of visual features extracted by deep neural networks. We improve the algorithm and generate a new visual feature dictionary from the network-extracted features to ensure correct visual feature query and matching. Finally, to further optimize the pose estimate of the SLAM system, we build on the original visual odometry and use the redundant pose information from LiDAR, fusing the pose estimates of the visual front end and the laser front end in a complementary manner. We apply the algorithm to the KITTI dataset and to an actual mobile robot platform. Experimental results show that, compared with other methods, the proposed approach effectively improves the accuracy and robustness of the SLAM system, which demonstrates its practicability and effectiveness.
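
The bag-of-words limitation mentioned in the abstract can be illustrated with a small sketch. Hamming distance counts differing bits, so it is defined for binary descriptors but not for the real-valued descriptors that a neural network typically outputs. The abstract does not say which metric the new dictionary uses; the Euclidean (L2) distance below, the toy three-word vocabulary, and the function names are all assumptions made for illustration only.

```python
import math


def hamming_distance(a: int, b: int) -> int:
    """Count the differing bits between two binary descriptors packed as ints."""
    return bin(a ^ b).count("1")


def l2_distance(u, v):
    """Euclidean distance between two real-valued descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor under L2.

    Assumption: L2 stands in for whatever metric the paper's dictionary uses.
    """
    return min(
        range(len(vocabulary)),
        key=lambda i: l2_distance(descriptor, vocabulary[i]),
    )


# Binary descriptors (handcrafted-feature style): Hamming distance applies.
binary_a, binary_b = 0b10110010, 0b10010011
print(hamming_distance(binary_a, binary_b))

# Float descriptors (network style): a bit count is meaningless here,
# so the lookup uses a real-valued metric instead.
vocabulary = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # toy words
query = [0.9, 0.1, 0.0]
print(nearest_word(query, vocabulary))
```

A real vocabulary would be a hierarchical tree trained on many descriptors rather than a flat list, but the choice of distance metric enters the lookup in the same way.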

Semantic Translation With Convolutional Encoder-Decoder Networks For Viewpoint Estimation

ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation

Semi-Supervised Learning for Visual Bird's Eye View Semantic Segmentation

Unpaired Salient Object Translation Via Spatial Attention Prior

CAD-Based Viewpoint Estimation of Texture-Less Object for Purposive Perception Using Domain Adaptation

Online Indoor Visual Odometry with Semantic Assistance under Implicit Epipolar Constraints

Model-Based Active Viewpoint Transfer For Purposive Perception

Cross-View Semantic Segmentation for Sensing Surroundings

Viewpoint Estimation Using Triplet Loss with A Novel Viewpoint-based Input Selection Strategy

PI-Trans: Parallel-ConvMLP and Implicit-Transformation Based GAN for Cross-View Image Translation

Real-time 3D Semantic Scene Perception for Egocentric Robots with Binocular Vision

ADeLA: Automatic Dense Labeling with Attention for Viewpoint Adaptation in Semantic Segmentation

Semantic and Optical Flow Guided Self-supervised Monocular Depth and Ego-Motion Estimation

A Pseudoinverse Siamese Convolutional Neural Network of Transformation Invariance Feature Detection and Description for a SLAM System

Stereo Vision SLAM Based on Feature Extraction Network

GAN-Based Virtual-to-Real Image Translation for Urban Scene Semantic Segmentation

An Approach for Construct Semantic Map with Scene Classification and Object Semantic Segmentation

Domain Adaptation for Viewpoint Estimation with Image Generation

Learning Semantic Segmentation from Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach

Cross-View Image Translation Based on Local and Global Information Guidance