Abstract: Deep learning-based stereo matching methods have made remarkable progress in recent years, but achieving high accuracy in real time remains challenging. To address this, we propose a Spatial Attention-Guided Upsampling network (SAGU-Net) for accurate, real-time stereo matching. First, we propose a Spatial Attention-Guided Cost Volume Upsampling (SAG-CVU) module that upsamples the low-resolution cost volume, computing each upsampled matching cost as the sum of neighboring coarse costs under the guidance of spatial attention. Unlike the recently popular coarse-to-fine (CTF) strategy, which typically upsamples the coarse disparity map, the SAG-CVU module upsamples the cost volume itself, which allows more raw information to propagate to subsequent stages and alleviates the loss of high-frequency information. To keep inference fast, a medium-resolution disparity map is regressed directly from the upsampled cost volume and then upsampled to full resolution with a Spatial Attention-Guided Disparity Map Upsampling (SAG-DMU) module. Whereas most CTF-based methods build and aggregate narrow cost volumes iteratively until a full-resolution disparity map is obtained, the SAG-DMU module lets the proposed network avoid this iterative procedure. In addition, we propose a simple yet effective gradient loss function that acts as a discontinuity-preserving regularizer and further improves overall accuracy, especially at depth discontinuities. Together, these design choices allow SAGU-Net to obtain accurate results in real time. Extensive experimental results demonstrate that SAGU-Net and its variants outperform not only state-of-the-art real-time networks but also many accuracy-oriented models on multiple datasets.
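
The cost-volume upsampling idea described in the abstract (each upsampled matching cost is a sum of neighboring coarse costs, guided by spatial attention) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `attention_guided_upsample`, the 3x3 neighborhood, the softmax normalization of the attention weights, the fixed scale factor, and the choice to upsample only the spatial dimensions are all assumptions made for illustration. In the actual network the attention weights would presumably be predicted by learned layers; here they are simply passed in as an array.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along one axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_guided_upsample(coarse_cost, attn_logits, scale=2):
    """Upsample a cost volume spatially with per-pixel attention weights.

    coarse_cost: (D, h, w) low-resolution cost volume.
    attn_logits: (9, h*scale, w*scale) unnormalized weights, one per
                 3x3 coarse neighbor, for every fine-resolution pixel
                 (hypothetical input; assumed to come from the network).
    Returns an array of shape (D, h*scale, w*scale).
    """
    D, h, w = coarse_cost.shape
    H, W = h * scale, w * scale
    assert attn_logits.shape == (9, H, W)

    # Assumption: weights are normalized to sum to 1 over the 9 neighbors.
    weights = softmax(attn_logits, axis=0)

    # Replicate border costs so every coarse pixel has a full 3x3 window.
    padded = np.pad(coarse_cost, ((0, 0), (1, 1), (1, 1)), mode="edge")

    # Coarse-grid coordinates of the pixel that each fine pixel falls into.
    ys = np.arange(H) // scale
    xs = np.arange(W) // scale

    out = np.zeros((D, H, W))
    k = 0
    for dy in range(3):
        for dx in range(3):
            # Coarse costs at offset (dy - 1, dx - 1), gathered per fine pixel.
            neighbor = padded[:, ys[:, None] + dy, xs[None, :] + dx]
            out += weights[k] * neighbor  # weights[k] broadcasts over D
            k += 1
    return out
```

With attention concentrated entirely on the center neighbor, this sketch reduces to nearest-neighbor upsampling of the cost volume; spreading the attention over the window lets each fine-resolution cost blend several coarse costs, which is the behavior the abstract attributes to SAG-CVU.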

Guided aggregation and disparity refinement for real-time stereo matching

Disparity Estimation Using Multilevel and Global Information

Space-temporal stereo matching for dynamic scene

Improved real-time three-dimensional stereo matching with local consistency

Stereo Matching Method with Integrated Geometric Encoding for Disparity Refinement

Accurate Real-Time Stereo Correspondence Using Intra- and Inter-Scanline Optimization

Stereo matching algorithm using improved guided filtering

Superpixel Guided Network for Three-Dimensional Stereo Matching

Edge-preserving Guided Filtering Based Cost Aggregation for Stereo Matching

Adaptive Disparity Computation Using Local and Non-Local Cost Aggregations

Real-time stereo matching with high accuracy via Spatial Attention-Guided Upsampling

UGNet: Uncertainty aware geometry enhanced networks for stereo matching

GPDF-Net: geometric prior-guided stereo matching with disparity fusion refinement

Segment-Based Disparity Refinement With Occlusion Handling for Stereo Matching

Cost Volume Aggregation in Stereo Matching Revisited: A Disparity Classification Perspective

GA-Stereo: A Real-Time Stereo Network Based on the Gradient Flow Shunting Strategy and the Atrous Pyramid Network

Hybrid stereo matching by dynamic programming with enhanced cost entry for real-time depth generation

Learning for Disparity Estimation Through Feature Constancy

Lightweight multi-scale convolutional neural network for real time stereo matching

Stereo Matching with Space-Constrained Cost Aggregation and Segmentation-Based Disparity Refinement

Global Matching-Optimization Network for Stereo Depth Estimation