Abstract: Neural network-based stereo matching algorithms have made significant progress in fields such as robot navigation and autonomous driving. In these application scenarios, obtaining the disparity of stereo images quickly and accurately is critical for real-time decision-making. However, current stereo matching algorithms struggle to achieve real-time speed while maintaining high accuracy. In this paper, we propose a disparity update strategy based on geometric encoding (MDStereo), which uses global and non-local geometric feature information to update disparity for accurate and fast matching. MDStereo first constructs a group of geometric encoding volumes to encode the local information of the image. Next, we propose a new disparity update method, GEDU, which retrieves correlations from high-resolution cost volumes by sampling and then fuses them with the geometric encoding information to iteratively update the disparity. Compared to RAFT-Stereo, which retrieves correlations from all cascade cost volumes, GEDU provides equally rich information with a more concise architecture. Furthermore, to speed up inference, we improve the 3D stacked hourglass network, enlarging its receptive field while reducing its computational complexity. We validate the effectiveness and accuracy of MDStereo on several benchmarks: on the Scene Flow dataset it achieves an EPE (end-point error) of 0.58 pixels, a 3-pixel error of 2.58%, and a runtime of 43 ms. At the time of writing, MDStereo outperforms published real-time methods on the popular KITTI 2012 and KITTI 2015 benchmarks. Compared with existing methods that iteratively update disparity (e.g., RAFT-Stereo), our method reduces memory consumption by 54% and greatly improves inference speed.
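For readers unfamiliar with the two accuracy figures quoted above, the following minimal sketch shows how an end-point error and a 3-pixel error are commonly computed from a predicted and a ground-truth disparity map. It assumes the usual definitions (EPE as the mean absolute disparity error, 3-pixel error as the percentage of pixels whose error exceeds 3 pixels); the abstract does not spell out the exact evaluation protocol, and the function name `disparity_metrics` and the toy values are hypothetical.

```python
from typing import Sequence, Tuple


def disparity_metrics(
    pred: Sequence[float],
    gt: Sequence[float],
    threshold: float = 3.0,
) -> Tuple[float, float]:
    """Return (EPE in pixels, percentage of pixels with error > threshold).

    `pred` and `gt` are flattened disparity maps of equal length.
    Assumed definitions, not taken from the paper's evaluation code.
    """
    if len(pred) != len(gt) or len(gt) == 0:
        raise ValueError("pred and gt must be non-empty and of equal length")

    # Per-pixel absolute disparity error.
    errors = [abs(p - g) for p, g in zip(pred, gt)]

    # End-point error: mean absolute error over all pixels.
    epe = sum(errors) / len(errors)

    # Threshold error: share of pixels whose error exceeds the threshold.
    bad_percent = 100.0 * sum(e > threshold for e in errors) / len(errors)
    return epe, bad_percent


# Toy example with made-up disparities (four pixels).
pred = [10.0, 20.5, 33.0, 41.0]
gt = [10.0, 20.0, 30.0, 45.0]
epe, bad3 = disparity_metrics(pred, gt)
print(f"EPE = {epe:.3f} px, 3-pixel error = {bad3:.1f}%")
```

In practice, benchmarks usually restrict both metrics to pixels with valid ground truth; that masking step is omitted here for brevity.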

MA-Stereo: Real-Time Stereo Matching Via Multi-Scale Attention Fusion and Spatial Error-Aware Refinement

Real-time Stereo Vision System Using Adaptive Weight Cost Aggregation Approach

Sparse LIDAR Measurement Fusion with Joint Updating Cost for Fast Stereo Matching

Stereo Matching Method with Integrated Geometric Encoding for Disparity Refinement

Improved real-time three-dimensional stereo matching with local consistency

Accurate Real-Time Stereo Correspondence Using Intra- and Inter-Scanline Optimization

EAI-Stereo: Error Aware Iterative Network for Stereo Matching

MC-Stereo: Multi-peak Lookup and Cascade Search Range for Stereo Matching

Stacking Learning with Coalesced Cost Filtering for Accurate Stereo Matching

Self-adaptive Multi-scale Aggregation Network for Stereo Matching

Superpixel Guided Network for Three-Dimensional Stereo Matching

Deep Stereo Matching With Hysteresis Attention and Supervised Cost Volume Construction

A Fast Stereo Matching Network with Multi-Cross Attention

Multi-Dimensional Attention on Cost Volume for Stereo Matching

Stereo Matching with Space-Constrained Cost Aggregation and Segmentation-Based Disparity Refinement

Stereo Matching in Time: 100+ FPS Video Stereo Matching for Extended Reality

Adaptive Cost Volume Representation for Unsupervised High-resolution Stereo Matching

Practical Stereo Matching via Cascaded Recurrent Network with Adaptive Correlation

Temporally Consistent Stereo Matching

Stereo Matching Method for Remote Sensing Images Based on Attention and Scale Fusion

Stereo Matching Accelerator With Re-Computation Scheme and Data-Reused Pipeline for Autonomous Vehicles