Abstract: RGB-D cameras have a limited working range and struggle to measure depth accurately at long distances. They are also easily affected by strong lighting and other external factors, which degrades the accuracy of the acquired depth information. Recently, deep learning has achieved great success in visual SLAM: it can learn high-level features directly from visual inputs and improve the accuracy of depth estimation. Deep learning therefore has the potential to extend the sources of depth information and improve the performance of SLAM systems. However, existing deep learning-based methods are mainly supervised and require large amounts of ground-truth depth data, which are hard to acquire in practice. In this paper, we first present an unsupervised learning framework for monocular depth and camera motion estimation, which not only uses image reconstruction as the supervisory signal but also exploits a pose estimation method to strengthen that signal and add training constraints. Furthermore, we use this framework to assist the traditional ORB-SLAM system when its initialization module cannot match enough features. Qualitative and quantitative experiments show that our framework performs depth estimation comparably to supervised methods and outperforms the previous state-of-the-art approach by $13.5\%$ on the KITTI dataset. In addition, our framework significantly accelerates the initialization of ORB-SLAM and effectively improves the accuracy of environmental mapping in scenes with strong lighting and weak texture.

Semisupervised learning-based depth estimation with semantic inference guidance

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

Monocular Depth Estimation Based on Unsupervised Learning

Learning Depth Via Leveraging Semantics: Self-supervised Monocular Depth Estimation with Both Implicit and Explicit Semantic Guidance

Semantic-Guided Representation Enhancement for Self-supervised Monocular Trained Depth Estimation

SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised Monocular Depth Estimation

Adaptive Semantic Fusion Framework for Unsupervised Monocular Depth Estimation

Semantically-Guided Representation Learning for Self-Supervised Monocular Depth

Depth Estimation of Supervised Monocular Images Based on Semantic Segmentation

Unsupervised depth estimation from monocular videos with hybrid geometric-refined loss and contextual attention

Unsupervised Learning-based Depth Estimation aided Visual SLAM Approach

Semi-Supervised Monocular Depth Estimation with Left-Right Consistency Using Deep Neural Network

Monocular Depth Estimation Using Self-Supervised Learning with More Effective Geometric Constraints

Rethinking Training Objective for Self-Supervised Monocular Depth Estimation - Semantic Cues to Rescue

Embodiment: Self-Supervised Depth Estimation Based on Camera Models

Semantic-Aware Depth Super-Resolution in Outdoor Scenes

An Adaptive Unsupervised Learning Framework For Monocular Depth Estimation

Unsupervised Video Depth Estimation Based on Ego-motion and Disparity Consensus

3D Object Aided Self-Supervised Monocular Depth Estimation

Structure-Centric Robust Monocular Depth Estimation via Knowledge Distillation

Semantic and Optical Flow Guided Self-supervised Monocular Depth and Ego-Motion Estimation