Abstract: For CNN-based stereo matching methods, cost volumes play an important role in achieving good matching accuracy. In this paper, we present an end-to-end trainable convolutional neural network that makes full use of cost volumes for stereo matching. Our network consists of three sub-modules: shared feature extraction, initial disparity estimation, and disparity refinement. Cost volumes are calculated at multiple levels from the shared features and are used in both the initial disparity estimation and the disparity refinement sub-modules. To improve the efficiency of disparity refinement, multi-scale feature constancy is introduced to measure the correctness of the initial disparity in feature space. The sub-modules are tightly coupled, which makes the network compact and easy to train. Moreover, we investigate how to develop a robust model that performs well across multiple datasets with different characteristics. We achieve this with a two-stage finetuning scheme that gently transfers the model to the target datasets. Specifically, in the first stage the model is finetuned on both a large synthetic dataset and the target datasets with a relatively large learning rate, while in the second stage it is trained on the target datasets only, with a small learning rate. The proposed method is tested on several benchmarks, including the Middlebury 2014, KITTI 2015, ETH3D 2017, and SceneFlow datasets. Experimental results show that our method achieves state-of-the-art performance on all of these datasets. The proposed method also won 1st prize in the stereo task of the Robust Vision Challenge 2018.
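To make the two central terms of the abstract concrete, the following is a minimal NumPy sketch of a correlation-style cost volume and of a feature constancy map computed from an initial disparity. It is an illustration under stated assumptions, not the network described above: the abstract does not specify how costs are computed or how features are warped, so the function names, the channel-wise correlation, and the nearest-neighbour warping are all hypothetical choices made for this example.

```python
import numpy as np


def correlation_cost_volume(feat_l, feat_r, max_disp):
    """Correlation cost volume (assumed form, not taken from the paper).

    feat_l, feat_r: feature maps of shape (C, H, W).
    Returns an array of shape (max_disp, H, W) where entry [d, y, x] is the
    channel-averaged product of the left feature at x and the right feature
    at x - d. Positions where x - d falls outside the image stay at zero.
    """
    _, h, w = feat_l.shape
    volume = np.zeros((max_disp, h, w), dtype=feat_l.dtype)
    for d in range(max_disp):
        volume[d, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, : w - d]).mean(axis=0)
    return volume


def warp_right_to_left(feat_r, disp):
    """Sample right features at x - disp (nearest neighbour, for simplicity)."""
    _, h, w = feat_r.shape
    xs = np.arange(w)[None, :] - np.rint(disp).astype(int)
    xs = np.clip(xs, 0, w - 1)          # clamp samples that leave the image
    rows = np.arange(h)[:, None]
    return feat_r[:, rows, xs]          # shape (C, H, W)


def feature_constancy(feat_l, feat_r, disp):
    """Per-pixel mean absolute feature difference after warping.

    Small values suggest the disparity is consistent in feature space;
    large values flag pixels that a refinement stage should revisit.
    """
    return np.abs(feat_l - warp_right_to_left(feat_r, disp)).mean(axis=0)
```

In this sketch, a disparity estimate would be read off the cost volume with `volume.argmax(axis=0)`, and the constancy map would be recomputed at each feature scale to obtain a multi-scale signal. A trained network would replace both hand-written steps with learned layers.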

Stereo Matching Using Multi-Level Cost Volume and Multi-Scale Feature Constancy
