OMNET: Real-Time Stereo Matching with Unsupervised Occlusion Mask.

Weiqi Wang, Shuiqiang Ye, Xinan Wang, Yong Zhao
DOI: https://doi.org/10.1109/icip46576.2022.9897748
2022-01-01
Abstract: Although CNNs have powerful learning capability, it is still difficult for them to identify corresponding points in occluded regions. The ghosting effect that appears in occluded regions during feature warping is a performance bottleneck for many stereo matching networks. In this paper, we propose an Occlusion-Aware Refinement Module (OARM), which learns a rough occlusion map from multi-scale aggregated cost volumes, without occlusion supervision, and uses it to mask and filter harmful occluded regions in the warped image. Combined with simple yet efficient 2D-based Intra/Cross-Level Aggregation Modules, which aggregate information across scales before the disparity refinement stage, OARM helps our proposed OMNet achieve an error rate of 1.82% (D1-all) on the KITTI 2015 dataset, which is better than most 3D-based networks. Meanwhile, OMNet remains real-time and can process a 1248×384 resolution image pair at 26 fps.
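The masking idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: OARM learns its occlusion map inside the network from cost volumes, whereas here the map is simply passed in as an array. The function names `warp_right_to_left` and `mask_warped`, the nearest-neighbour sampling, the single-channel images, and the 0.5 threshold are all assumptions made for illustration.

```python
import numpy as np

def warp_right_to_left(right, disparity):
    """Warp a single-channel right image into the left view.

    Each left-view pixel (y, x) samples the right image at (y, x - d),
    using nearest-neighbour rounding (an assumption for simplicity).
    Returns the warped image and a mask of in-bounds samples.
    """
    h, w = disparity.shape
    xs = np.arange(w)[None, :] - np.rint(disparity).astype(int)
    valid = (xs >= 0) & (xs < w)      # samples that fall inside the image
    xs = np.clip(xs, 0, w - 1)        # clamp so indexing stays legal
    rows = np.arange(h)[:, None]
    return right[rows, xs], valid

def mask_warped(warped, occlusion_prob, valid, threshold=0.5):
    """Zero out pixels flagged as occluded or sampled out of bounds.

    occlusion_prob stands in for a learned occlusion map with values
    in [0, 1]; the threshold is a hypothetical choice.
    """
    keep = valid & (occlusion_prob < threshold)
    return warped * keep, keep

# Toy example: constant disparity of 2 px, one pixel marked occluded.
right = np.arange(12, dtype=float).reshape(2, 6)
disparity = np.full((2, 6), 2.0)
occlusion_prob = np.zeros((2, 6))
occlusion_prob[0, 3] = 0.9

warped, valid = warp_right_to_left(right, disparity)
masked, keep = mask_warped(warped, occlusion_prob, valid)
```

In this toy case the first two columns are dropped because their samples fall outside the right image, and pixel (0, 3) is dropped because the stand-in occlusion map flags it. A refinement stage would then see only the remaining pixels of the warped image, which is the effect the abstract attributes to the occlusion mask.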