A Multi-Task Network for Multi-View Stereo Reconstruction: When Semantic Consistency Based Clustering Meets Depth Estimation Optimization

Xin Huang, Shulei Zhang, Jiayi Li, Leiguang Wang
DOI: https://doi.org/10.1109/tgrs.2024.3371059
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: We propose CSC-MVS, a novel network for multiview stereo (MVS) reconstruction in remote sensing that incorporates clustering-based semantic consistency into depth estimation optimization. High-level semantic information acquired from multiple views is used to construct semantic consistency, which in turn guides the optimization of the MVS network. Specifically, a nonnegative matrix factorization (NMF) branch and a deep spectral decomposition (DSD) branch generate local and global semantic guidance, respectively. We then propose an uncertainty multitask optimization method that adaptively combines matching and semantic metrics. CSC-MVS is evaluated on representative benchmarks, including the WHU TLC and LuoJia-MVS datasets, demonstrating its effectiveness and generality across diverse remote sensing scenarios. Comprehensive experimental results show that CSC-MVS significantly improves the performance of various MVS baseline networks and achieves notable accuracy in depth reconstruction. Ablation studies validate the contribution of each component, and a sensitivity analysis confirms the robustness and adaptability of the proposed method. The code is available at https://github.com/zsl-whu/csc-mvs.
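The abstract does not spell out how the uncertainty multitask optimization combines the matching and semantic terms. As a rough illustration only, the sketch below uses one widely used formulation, homoscedastic uncertainty weighting, in which each task loss L_i is scaled by exp(-s_i) and regularized by s_i, where s_i is a learnable log-variance. This formulation, the function names, and the numeric loss values are assumptions made for illustration and may differ from what CSC-MVS actually implements.

```python
import math


def uncertainty_weighted_loss(losses, log_vars):
    """Combine task losses with per-task log-variances s_i = log(sigma_i^2).

    total = sum_i exp(-s_i) * L_i + s_i
    (an assumed, commonly used form; not confirmed as the paper's objective)
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


def optimal_log_var(loss):
    """Minimiser of exp(-s) * L + s for a fixed positive loss L, i.e. s = log(L)."""
    return math.log(loss)


# Hypothetical loss values, for illustration only.
matching_loss, semantic_loss = 0.8, 2.5

# Equal weighting corresponds to s_i = 0 for every task.
equal = uncertainty_weighted_loss([matching_loss, semantic_loss], [0.0, 0.0])

# Adapted weighting: each s_i sits at its closed-form optimum for the current loss.
adapted = uncertainty_weighted_loss(
    [matching_loss, semantic_loss],
    [optimal_log_var(matching_loss), optimal_log_var(semantic_loss)],
)
```

In a training loop the log-variances would normally be learnable parameters updated by gradient descent alongside the network weights, so that the noisier task is down-weighted automatically; the closed-form optimum is used here only to keep the sketch dependency-free.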