Abstract: Learning-based supervised monocular depth estimation methods have shown promising results compared with traditional methods. However, they require large amounts of high-quality ground-truth depth data as supervision labels. Because of the limitations of acquisition equipment, recording ground-truth depth for diverse scenes is expensive and impractical. Self-supervised monocular depth estimation, which does not use ground-truth depth, is a promising alternative, but estimating depth from a single image in a self-supervised manner is geometrically ambiguous and yields suboptimal results. In this paper, we propose a novel semi-supervised monocular stereo matching method that builds on existing approaches to improve the accuracy of depth estimation. The idea is inspired by experimental results showing that, within the same self-supervised network model, depth estimated from a stereo pair is more accurate than depth estimated from a single view. We therefore decompose monocular depth estimation into two sub-problems: right-view synthesis followed by semi-supervised stereo matching. To improve the accuracy of the synthesized right view, we extend the existing view synthesis method Deep3D with a left-right consistency constraint and a smoothness constraint. To reduce the error introduced by the reconstructed right view, we propose a semi-supervised stereo matching model that uses disparity maps generated by a self-supervised stereo matching model as supervision cues and combines them with self-supervised cues to optimize the stereo matching network. At test time, the two networks are connected in a pipeline and predict the depth map directly from a single image. Both stages obey geometric principles and improve estimation accuracy. Test results on the KITTI dataset show that the method outperforms current mainstream self-supervised monocular depth estimation methods under the same conditions.
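
The two constraints named in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the L1 penalty, the edge-aware weighting, and the sign convention (a left-image pixel at column x matches the right-image pixel at column x - d) are all assumptions chosen for illustration, and the abstract does not specify the exact loss forms.

```python
import numpy as np


def warp_horizontal(src, disp):
    """Sample `src` at column x - disp[y, x] using linear interpolation per row.

    Assumption: rectified stereo with non-negative disparities in pixels.
    Sample positions outside the image are clamped to the border.
    """
    h, w = src.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disp
    xs = np.clip(xs, 0.0, w - 1.0)
    x0 = np.floor(xs).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    frac = xs - x0
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * src[rows, x0] + frac * src[rows, x1]


def left_right_consistency_loss(disp_left, disp_right):
    """Mean L1 gap between the left disparity and the right disparity
    projected into the left view (hypothetical formulation)."""
    projected = warp_horizontal(disp_right, disp_left)
    return float(np.mean(np.abs(disp_left - projected)))


def smoothness_loss(disp, image):
    """Edge-aware smoothness: disparity gradients are penalised less
    where the grayscale image itself has strong gradients."""
    ddx = np.abs(np.diff(disp, axis=1))
    ddy = np.abs(np.diff(disp, axis=0))
    idx = np.abs(np.diff(image, axis=1))
    idy = np.abs(np.diff(image, axis=0))
    return float(np.mean(ddx * np.exp(-idx)) + np.mean(ddy * np.exp(-idy)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((4, 8))
    disp_l = np.full((4, 8), 2.0)
    disp_r = np.full((4, 8), 2.0)
    print(left_right_consistency_loss(disp_l, disp_r))
    print(smoothness_loss(disp_l, image))
```

In a training setup, terms like these would typically be weighted and added to a photometric reconstruction loss; the weights are not given in the abstract and are left out here.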

A Semi-Supervised Monocular Stereo Matching Method

Monocular Depth Estimation Based on Unsupervised Learning

Weakly Supervised Monocular Depth Estimation Method Based on Stereo Matching Labels

A Depth Estimation Framework Based on Unsupervised Learning and Cross-Modal Translation

Revealing the Reciprocal Relations Between Self-Supervised Stereo and Monocular Depth Estimation

Single View Stereo Matching

Stereo Matching by Self-supervision of Multiscopic Vision

Self-Supervised Monocular Depth Estimation with Self-Reference Distillation and Disparity Offset Refinement

Unsupervised Monocular Depth Estimation Via Recursive Stereo Distillation

Monocular Depth Estimation Using Self-Supervised Learning with More Effective Geometric Constraints

Learning Monocular Depth by Distilling Cross-domain Stereo Networks

Self-Supervised Monocular Depth Estimation with Multi-constraints

Self-Supervised Monocular Depth Learning in Low-Texture Areas

Self-Supervised Learning for Stereo Matching with Self-Improving Ability

Semi-Supervised Monocular Depth Estimation with Left-Right Consistency Using Deep Neural Network

Self-Supervised Monocular Depth Estimation via Binocular Geometric Correlation Learning

Transferring knowledge from monocular completion for self-supervised monocular depth estimation

3D Object Aided Self-Supervised Monocular Depth Estimation

Monocular Weakly Supervised Depth and Pose Estimation Method Based on Multi-Information Fusion

Semi-Supervised Adversarial Monocular Depth Estimation

Self-Supervised Monocular Depth Estimation With Multiscale Perception