Survey on Semantic Stereo Matching / Semantic Depth Estimation

Viny Saajan Victor, Peter Neigel

DOI: https://doi.org/10.48550/arXiv.2109.10123

2021-09-21

Abstract: Stereo matching is one of the most widely used techniques for inferring depth from stereo images, owing to its robustness and speed. It has become a major research topic because of its applications in autonomous driving, robotic navigation, 3D reconstruction, and many other fields. The major challenge in stereo matching is finding pixel correspondences in non-textured, occluded, and reflective areas. Recent developments have shown that semantic cues from image segmentation can improve the results of stereo matching, and many deep neural network architectures have been proposed to exploit semantic segmentation for this purpose. This paper compares the state-of-the-art networks in terms of both accuracy and speed, which are especially important in real-time applications.

Computer Vision and Pattern Recognition, Machine Learning

What problem does this paper attempt to address?

This paper addresses the main challenge in stereo matching: finding pixel correspondences in non-textured, occluded, and reflective regions. Specifically, it explores how semantic cues from image segmentation can improve the accuracy of stereo matching, and it compares current state-of-the-art network architectures in terms of accuracy and speed, both of which are particularly important for real-time applications.

The paper describes several components that combine semantic segmentation with depth estimation to improve stereo matching. These include, but are not limited to:

1. **Joint feature extraction**: Extract features that are common to stereo matching and semantic segmentation from the input stereo images.
2. **Disparity estimation**: Use deep convolutional layers to extract features specific to disparity estimation, build a cost volume, and regress an initial disparity map from it.
3. **Semantic segmentation**: Extract semantic labels for the image, which help to improve disparity estimation in non-textured, occluded, and reflective regions.
4. **Disparity refinement**: Use semantic cues to refine the initial disparity, especially in difficult regions, to improve the accuracy of the final disparity map.

In addition, the paper discusses the loss functions used to optimize the models during training, such as smooth $L_1$ loss, softmax cross-entropy loss, photometric loss, regularization loss, consistency loss, smoothness loss, and cross-domain discontinuity loss.

In summary, the main objective of this paper is to explore and compare methods for semantic stereo matching, with a focus on the trade-off between accuracy and real-time performance, and to provide guidance for practical applications.
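
As a rough illustration of the disparity estimation step and the $L_1$-type smooth loss mentioned above, the sketch below regresses a sub-pixel disparity for a single pixel from a list of matching costs and then scores it against a ground-truth value. It assumes a soft-argmin style regression (a softmax over negated costs followed by an expected value), which is one common choice in deep stereo networks; the summary above does not say which regression the surveyed networks use. The function names, the `beta` parameter, and the example costs are hypothetical.

```python
import math


def soft_argmin_disparity(costs):
    """Regress a sub-pixel disparity from per-candidate matching costs.

    costs[d] is the matching cost of candidate disparity d for one pixel
    (lower cost means a better match). Assumed formulation: soft-argmin.
    """
    # Softmax over negated costs; subtracting the minimum cost keeps
    # the exponentials numerically stable without changing the result.
    lowest = min(costs)
    weights = [math.exp(-(c - lowest)) for c in costs]
    total = sum(weights)
    # Expected disparity under the resulting probability distribution.
    return sum(d * w for d, w in enumerate(weights)) / total


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic for small errors, linear for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


# Hypothetical costs for candidate disparities 0..3 at one pixel.
example_costs = [4.0, 1.0, 0.5, 3.0]
estimate = soft_argmin_disparity(example_costs)
print(estimate, smooth_l1(estimate, 2.0))
```

In a real network the same computation would run over every pixel of a 3D or 4D cost volume on the GPU; the scalar version here only shows the arithmetic.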
