Eglcr: Edge Structure Guidance and Scale Adaptive Attention for Iterative Stereo Matching

Zhien Dai, Zhaohui Tang, Hu Zhang, Can Tian, Mingjun Pan, Yongfang Xie
DOI: https://doi.org/10.1145/3664647.3681372
2024-01-01
Abstract: Stereo matching is a pivotal technique for depth estimation and is widely applied in computer vision tasks. Although many methods have been reported recently, they still face challenges such as significant disparity variations at object boundaries, difficult prediction in large-disparity regions, and suboptimal generalization when the label distribution differs between source and target domains. We therefore propose a stereo-matching model, Eglcr, that uses edge structure information and multi-scale matching similarity features for better disparity estimation. First, we use a lightweight network to predict the initial disparity. Then, we develop a multi-scale similarity feature extraction module, incorporating adaptive attention mechanisms, to capture the fused similarity information of stereo images across scales. Meanwhile, we introduce an edge structure-aware module that features an iteratively optimized disparity map and a scale attention factor, aimed at accurately delineating edge information in complex scenes. We then employ an iterative strategy for disparity estimation, guided by the fused multi-scale similarity features and the detailed edge structure information. We conduct extensive experiments on popular stereo matching datasets, including Middlebury, KITTI, ETH3D, and Scene Flow. The results show that Eglcr achieves state-of-the-art performance in both accuracy and generalization. Our code is available at https://github.com/kangarooCV/Eglcr.
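For readers new to the task, the link between disparity and depth that the abstract takes for granted can be sketched as follows. This is the standard pinhole relation for a rectified stereo pair, not code from the Eglcr repository. The function name `disparity_to_depth` and the camera values in the usage lines are illustrative assumptions.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (metres) for a rectified stereo pair.

    Uses the standard relation depth = focal_length * baseline / disparity.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative for a rectified pair")
    if disparity_px == 0:
        # Zero disparity corresponds to a point infinitely far away.
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Illustrative camera parameters (assumed, not taken from the paper).
FOCAL_PX = 720.0
BASELINE_M = 0.54

near = disparity_to_depth(36.0, FOCAL_PX, BASELINE_M)  # large disparity, close point
far = disparity_to_depth(4.0, FOCAL_PX, BASELINE_M)    # small disparity, distant point
far_off_by_one = disparity_to_depth(3.0, FOCAL_PX, BASELINE_M)

print(f"near: {near:.2f} m, far: {far:.2f} m, far with 1 px error: {far_off_by_one:.2f} m")
```

Because depth is inversely proportional to disparity, a one-pixel error on a distant point shifts the estimated depth far more than the same error on a nearby point, which is one reason disparity accuracy at boundaries and across scales matters.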