Improving Depth Completion Via Depth Feature Upsampling

Yufei Wang, Ge Zhang, Shaoqian Wang, Bo Li, Qi Liu, Le Hui, Yuchao Dai
DOI: https://doi.org/10.1109/cvpr52733.2024.01994
2024-01-01
Abstract: The encoder-decoder network (ED-Net) is a common choice for existing depth completion methods, but its working mechanism is ambiguous. In this paper, we visualize the internal feature maps to analyze how the network densifies the input sparse depth. We find that the encoder feature of ED-Net focuses on the areas surrounding the input depth points. To obtain a dense feature and thus estimate complete depth, the decoder feature tends to complement and enhance the encoder feature through the skip connection so that the fused encoder-decoder feature is dense; as a result, the decoder feature is also sparse. However, ED-Net obtains the sparse decoder feature from the dense fused feature of the previous stage, and this "dense→sparse" process destroys the completeness of features and loses information. To address this issue, we present a depth feature upsampling network (DFU) that explicitly utilizes these dense features to guide the upsampling of a low-resolution (LR) depth feature to a high-resolution (HR) one. The completeness of features is maintained throughout the upsampling process, thus avoiding information loss. Furthermore, we propose a confidence-aware guidance module (CGM), which is confidence-aware and performs guidance with adaptive receptive fields (GARF), to fully exploit the potential of these dense features as guidance. Experimental results show that our DFU, a plug-and-play module, can significantly improve the performance of existing ED-Net based methods with limited computational overhead, and new SOTA results are achieved. In addition, the generalization capability on sparser depth is enhanced. Project page: https://npucvr.github.io/DFU.
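The idea of using a dense feature to guide the upsampling of an LR feature, weighted by confidence, can be illustrated with a small sketch. The code below is not the paper's DFU or CGM: it swaps in a classical, hand-written confidence-weighted joint bilateral upsampling on a single channel, with a fixed (not adaptive) receptive field. The function name `guided_upsample` and the parameters `scale`, `radius`, and `sigma` are hypothetical and chosen only for illustration.

```python
import math


def guided_upsample(lr_feat, lr_conf, hr_guide, scale=2, radius=1, sigma=0.5):
    """Confidence-weighted joint bilateral upsampling (illustrative only).

    This is NOT the paper's DFU/CGM; it is a classical stand-in.
    lr_feat:  h x w nested list, low-resolution feature (one channel).
    lr_conf:  h x w nested list in [0, 1], confidence of each LR value.
    hr_guide: (h*scale) x (w*scale) nested list, dense guidance feature.
    """
    h, w = len(lr_feat), len(lr_feat[0])

    # Guidance at LR resolution: mean of each scale x scale block of hr_guide.
    lr_guide = [
        [
            sum(hr_guide[i * scale + dy][j * scale + dx]
                for dy in range(scale) for dx in range(scale)) / scale ** 2
            for j in range(w)
        ]
        for i in range(h)
    ]

    out = [[0.0] * (w * scale) for _ in range(h * scale)]
    for y in range(h * scale):
        for x in range(w * scale):
            ci, cj = y // scale, x // scale  # LR cell containing this HR pixel
            num = den = 0.0
            # Fixed square window of LR neighbours, clipped at the borders.
            for i in range(max(0, ci - radius), min(h, ci + radius + 1)):
                for j in range(max(0, cj - radius), min(w, cj + radius + 1)):
                    diff = hr_guide[y][x] - lr_guide[i][j]
                    # Weight = confidence x similarity in the guidance feature.
                    weight = lr_conf[i][j] * math.exp(-diff * diff / (2 * sigma ** 2))
                    num += weight * lr_feat[i][j]
                    den += weight
            # Fall back to the containing LR value if every weight is zero.
            out[y][x] = num / den if den > 0 else lr_feat[ci][cj]
    return out
```

In this sketch, an LR value with zero confidence contributes nothing to its neighbours, so the HR output stays dense as long as at least one LR value in the window is trusted. A learned module would presumably predict the weights and the window shape rather than fix them by hand.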