Stereo Matching Method with Integrated Geometric Encoding for Disparity Refinement
Shujia Ye, Ligang Cao, Chun Yuan, Hao Feng, Lihua Yang, Qianghua Li, Peng Sun, Gang Li
DOI: https://doi.org/10.1109/ijcnn60899.2024.10650651
2024-01-01
Abstract: Neural network-based stereo matching algorithms have made significant progress in fields such as robot navigation and autonomous driving. In these application scenarios, obtaining the disparity of stereo images quickly and accurately is critical for real-time decisions. However, current stereo matching algorithms struggle to reach real-time speed while maintaining high accuracy. In this paper, we propose MDStereo, a method built on a disparity update strategy based on geometric encoding, which uses global and non-local geometric feature information to update disparity for fast and highly accurate matching. MDStereo first constructs a group of geometric encoding volumes to encode the local information of the image. Next, a new disparity update method, GEDU, is proposed, which retrieves correlations from the high-resolution cost volumes by sampling and then fuses the geometric encoding information to iteratively update the disparity. Compared to RAFT-Stereo, which retrieves correlations from all cascade cost volumes, our GEDU not only provides rich information but also has a more concise architecture. Furthermore, to speed up inference, we improve the 3D stacked hourglass network, which effectively increases the receptive field and reduces the computational complexity. MDStereo is validated on several benchmarks, achieving an EPE (end-point error) of 0.58 pixels, a 3-pixel error of 2.58%, and a runtime of 43 ms on the Scene Flow dataset. At the time of writing, MDStereo outperforms the published real-time methods on the popular KITTI 2012 and KITTI 2015 benchmarks. Compared with existing methods that iteratively update disparity (e.g., RAFT-Stereo), our method reduces memory consumption by 54% and greatly improves inference speed.
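The abstract reports an EPE and a 3-pixel error without defining them. As a minimal sketch, assuming the common definitions (EPE as the mean absolute disparity error in pixels, and 3-pixel error as the percentage of pixels whose absolute error exceeds 3 pixels), the two metrics could be computed as follows. The function name `disparity_metrics` and the toy values are hypothetical, and masking of invalid ground-truth pixels, which a real evaluation would likely need, is omitted.

```python
from typing import Sequence, Tuple


def disparity_metrics(
    pred: Sequence[float],
    gt: Sequence[float],
    threshold: float = 3.0,
) -> Tuple[float, float]:
    """Return (EPE in pixels, percentage of pixels with error > threshold).

    Assumed definitions, not taken from the paper:
      EPE       = mean of |pred - gt| over all pixels
      bad-pixel = 100 * (number of pixels with |pred - gt| > threshold) / N
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    # Per-pixel absolute disparity error.
    errors = [abs(p - g) for p, g in zip(pred, gt)]
    epe = sum(errors) / len(errors)
    # Count pixels whose error exceeds the threshold.
    bad = sum(1 for e in errors if e > threshold)
    return epe, 100.0 * bad / len(errors)


# Toy flattened disparity maps (hypothetical values, in pixels).
pred = [10.0, 20.5, 31.0, 44.0]
gt = [10.0, 20.0, 30.0, 40.0]
epe, bad3 = disparity_metrics(pred, gt)
print(f"EPE = {epe:.3f} px, 3-pixel error = {bad3:.1f}%")
```

On these toy values the per-pixel errors are 0, 0.5, 1 and 4 pixels, so only the last pixel counts toward the 3-pixel error.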