Digging into Depth-Adaptive Structure for Guided Depth Super-Resolution

Yue Hou, Lang Nie, Chunyu Lin, Baoqing Guo, Yao Zhao
DOI: https://doi.org/10.1016/j.displa.2024.102752
IF: 3.074
2024-01-01
Displays
Abstract: Depth maps captured by current depth cameras have a lower resolution than RGB images, which has made guided depth super-resolution (GDSR) a prominent research topic. Existing methods usually transfer the structure information of RGB images to guide the restoration of depth maps. However, due to the inherent modality gap, these approaches are prone to introducing spurious edges into the result, a problem known as “RGB texture over-transferred”. Accurate feature representation and selective utilization of RGB structure are therefore two key challenges for GDSR. In this paper, we dig into depth-adaptive structure to address both. We first design a Hybrid Encoder that incorporates the advantages of different network architectures into a unified feature extractor, balancing model efficiency against performance while providing comprehensive high-level semantics. Next, we use the high-frequency parts of depth maps to optimize those of RGB images through a cross-modal attention mechanism, which filters unreasonable components out of redundant textures and yields depth-adaptive structural features. Finally, we integrate Discrete Cosine Transform (DCT) operations for feature reconstruction, which enhances model interpretability; together with the modules above, this forms the complete DASNet. Experimental results demonstrate that DASNet improves quality in both depth maps and synthetic views.
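The abstract refers to the high-frequency parts of a depth map and to DCT operations without giving details. As a rough illustration only, the sketch below shows one generic way to isolate high-frequency content with an orthonormal 2D DCT: transform, zero the low-frequency coefficients, and invert. This is not the authors' DASNet; the function names (`dct_1d`, `idct_1d`, `apply_2d`, `high_frequency`) and the `u + v < cutoff` masking rule are assumptions made for this example.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
        out.append(scale * acc)
    return out

def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        acc = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            acc += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(acc)
    return out

def apply_2d(m, f):
    """Apply a 1D transform f along rows, then along columns."""
    rows = [f(r) for r in m]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def high_frequency(depth, cutoff):
    """Keep only DCT coefficients with u + v >= cutoff (hypothetical rule)."""
    coeffs = apply_2d(depth, dct_1d)
    for u, row in enumerate(coeffs):
        for v in range(len(row)):
            if u + v < cutoff:
                row[v] = 0.0  # discard low-frequency content
    return apply_2d(coeffs, idct_1d)
```

With `cutoff=1` only the DC term is removed, so a flat depth map maps to zeros while a depth step keeps its edge. In a learned model the masking would typically act on feature maps rather than raw depth values, and the paper's actual formulation may differ.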