Self-Supervised Multi-view Stereo Via Adjacent Geometry Guided Volume Completion

Luoyuan Xu, Tao Guan, Yuesong Wang, Yawei Luo, Zhuo Chen, Wenkai Liu, Wei Yang
DOI: https://doi.org/10.1145/3503161.3547926
2022-01-01
Abstract: Existing self-supervised multi-view stereo (MVS) approaches rely largely on photometric consistency for geometry inference and therefore struggle with low-texture or non-Lambertian surfaces. In this paper, we observe that adjacent geometry shares certain commonalities that can help infer the correct geometry of challenging or low-confidence regions. Exploiting this property in a self-supervised MVS approach remains difficult, however, because training data is lacking and consistency between views must be ensured. To address these issues, we propose a novel geometry inference training scheme that selectively masks regions with rich textures, where geometry can be recovered well and used as a supervisory signal, and then trains a purpose-designed cost volume completion network to recover the geometry of the masked regions. During inference, we instead mask the low-confidence regions and use the cost volume completion network for geometry correction. To handle the different depth hypotheses of the cost volume pyramid, we design a three-branch volume inference structure for the completion network. Further, treating planes as a special kind of geometry, we first identify planar regions from pseudo labels and then correct the low-confidence pixels using high-confidence labels through plane normal consistency. Extensive experiments on DTU and Tanks & Temples demonstrate the effectiveness of the proposed framework and its state-of-the-art performance.
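The masking asymmetry between training and inference can be illustrated with a minimal sketch. The abstract does not specify how texture richness or confidence is measured, so everything here is an assumption for illustration: gradient magnitude stands in for texture, the thresholds are arbitrary, and the function names (`training_mask`, `inference_mask`, `apply_mask`) are hypothetical rather than taken from the authors' code.

```python
import numpy as np

def texture_score(gray):
    """Per-pixel gradient magnitude, used as a crude texture proxy (assumption)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy)

def training_mask(gray, texture_thresh=0.1):
    """Training: hide richly textured pixels, whose geometry is assumed reliable
    enough to supervise the completion network. True means masked."""
    return texture_score(gray) > texture_thresh

def inference_mask(confidence, conf_thresh=0.5):
    """Inference: hide low-confidence pixels so their geometry is re-inferred
    from the surrounding, unmasked geometry. True means masked."""
    return confidence < conf_thresh

def apply_mask(cost_volume, mask):
    """Zero a (D, H, W) cost volume at masked pixels across all depth hypotheses."""
    return np.where(mask[None, :, :], 0.0, cost_volume)
```

In this sketch the same `apply_mask` step serves both phases; only the mask source changes, which mirrors the abstract's description of masking textured regions during training and low-confidence regions during inference.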