CVE-Net: Cost Volume Enhanced Network Guided by Sparse Features for Stereo Matching.

Xu Qingzhen, Guangdong University of Foreign Studies, Zeng Kun, Gong Yongyi, Luo Xiaonan
DOI: https://doi.org/10.1007/s00500-021-06257-4
IF: 3.732
2021-01-01
Soft Computing
Abstract: Deep learning based on convolutional neural networks (CNNs) has been successfully applied to stereo matching, as it can accelerate the training process and improve matching accuracy. However, existing CNN-based stereo matching frameworks often have two problems. The first is the generalization ability of the trained model. Stereo matching frameworks are usually pre-trained on the large synthetic Scene Flow dataset and then fine-tuned on the evaluation dataset. However, the evaluation dataset may contain only trivial training data, or may even lack disparity labels for some specific tasks, which adversely affects the generality of the trained model. The second problem is poor matching performance in ill-posed regions, which are difficult to distinguish and include weak-texture areas, repeated-texture areas, occluded areas, reflective structures, and fine structures. To ameliorate these problems, we propose the cost volume enhancement network (CVE-Net) guided by sparse features for stereo matching. CVE-Net uses edge information and saliency information to sparsely sample precise disparity labels during training. Furthermore, we enhance the cost volume by leveraging the precise sparse disparity labels to guide the direction of training. Experiments show that the generalization ability is significantly improved and that the domain-transfer problem on new datasets is significantly alleviated. In addition, introducing sparse multiple semantic features improves matching performance in ill-posed regions. Even without fine-tuning, the matching requirements can be met. These results demonstrate the effectiveness of CVE-Net.
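The abstract does not give implementation details, so the following is only a rough illustrative sketch of the two ideas it names: sampling sparse disparity labels at edge pixels, and using those labels to reweight a matching volume. Everything here is an assumption and not the authors' method: the function names (`edge_mask`, `sample_sparse_labels`, `enhance_cost_volume`), the gradient-threshold edge detector, the Gaussian reweighting, and the parameters `threshold`, `gain`, and `width` are all hypothetical. The volume is treated as a similarity score volume of shape (D, H, W), where higher means a better match. A saliency mask, if available, could be combined with the edge mask by a logical OR.

```python
import numpy as np

def edge_mask(image, threshold=0.2):
    """Mark pixels whose gradient magnitude exceeds a fraction of the maximum.

    Hypothetical stand-in for the edge cue mentioned in the abstract.
    """
    gy, gx = np.gradient(image.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    if peak == 0:
        return np.zeros(image.shape, dtype=bool)
    return magnitude > threshold * peak

def sample_sparse_labels(disparity, mask):
    """Keep disparity labels only where mask is True; NaN marks 'no label'."""
    sparse = np.full(disparity.shape, np.nan)
    sparse[mask] = disparity[mask]
    return sparse

def enhance_cost_volume(scores, sparse, gain=10.0, width=1.0):
    """Boost scores near the sparse disparity hints (assumed Gaussian weighting).

    scores: (D, H, W) similarity volume, higher = better match.
    sparse: (H, W) disparity hints, NaN where no hint exists.
    """
    num_disp = scores.shape[0]
    d = np.arange(num_disp, dtype=np.float64)[:, None, None]  # (D, 1, 1)
    valid = ~np.isnan(sparse)                                  # (H, W)
    hint = np.where(valid, sparse, 0.0)                        # NaN-free copy
    peak = np.exp(-((d - hint[None]) ** 2) / (2.0 * width ** 2))
    # Pixels without a hint keep weight 1, so their scores are unchanged.
    weight = np.where(valid[None], 1.0 + gain * peak, 1.0)
    return scores * weight

# Toy usage: a vertical step edge and a constant ground-truth disparity of 3.
image = np.zeros((6, 8))
image[:, 4:] = 1.0
disparity = np.full((6, 8), 3.0)
mask = edge_mask(image)
sparse = sample_sparse_labels(disparity, mask)
scores = np.ones((8, 6, 8))
enhanced = enhance_cost_volume(scores, sparse)
```

In this toy case the hints exist only along the step edge, so only those pixels have their scores pulled toward disparity 3. All other pixels pass through untouched.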