Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution

Huan Zheng, Wencheng Han, Jianbing Shen
2024-11-06
Abstract: Recovering high-quality depth maps from compressed sources has gained significant attention due to the limitations of consumer-grade depth cameras and the bandwidth restrictions during data transmission. However, current methods still suffer from two challenges. First, bit-depth compression produces a uniform depth representation in regions with subtle variations, hindering the recovery of detailed information. Second, densely distributed random noise reduces the accuracy of estimating the global geometric structure of the scene. To address these challenges, we propose a novel framework, termed geometry-decoupled network (GDNet), for compressed depth map super-resolution that decouples the high-quality depth map reconstruction process by handling global and detailed geometric features separately. To be specific, we propose the fine geometry detail encoder (FGDE), which is designed to aggregate fine geometry details in high-resolution low-level image features while simultaneously enriching them with complementary information from low-resolution context-level image features. In addition, we develop the global geometry encoder (GGE) that aims at suppressing noise and extracting global geometric information effectively via constructing compact feature representation in a low-rank space. We conduct experiments on multiple benchmark datasets, demonstrating that our GDNet significantly outperforms current methods in terms of geometric consistency and detail recovery. In the ECCV 2024 AIM Compressed Depth Upsampling Challenge, our solution won the 1st place award. Our code will be available.
Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the recovery of high-quality depth maps from compressed sources. Specifically, the authors focus on the depth-map quality degradation caused by the limitations of consumer-grade depth cameras and by bandwidth restrictions during data transmission. Current methods still fall short on two main challenges:

1. **Bit-depth compression**: Bit-depth compression produces a uniform depth representation in regions with subtle variations, which hinders the recovery of detailed information.
2. **Densely distributed random noise**: This noise reduces the accuracy of estimating the global geometric structure of the scene.

To address these challenges, the authors propose a new framework, the Geometry-Decoupled Network (GDNet), for compressed depth map super-resolution. GDNet decouples the high-quality depth map reconstruction process by handling global and detailed geometric features separately.

#### Main contributions

- **Geometric decoupling strategy**: A decoupling strategy is proposed to learn global and detailed geometric features separately.
- **Fine Geometry Detail Encoder (FGDE)**: An encoder designed to aggregate fine geometry details from high-resolution low-level image features while enriching them with complementary information from low-resolution context-level image features.
- **Global Geometry Encoder (GGE)**: An encoder developed to suppress noise and extract global geometric information by constructing a compact feature representation in a low-rank space.
- **Performance**: Experiments on multiple benchmark datasets show that GDNet significantly outperforms existing methods in geometric consistency and detail recovery.
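The first challenge above, bit-depth compression flattening regions with subtle variations, can be illustrated with a minimal NumPy sketch. This is not the paper's degradation model: the uniform quantizer `quantize_depth`, the 4-bit setting, and the 0 to 10 m depth range are assumptions chosen only to make the effect visible.

```python
import numpy as np


def quantize_depth(depth, bits, d_min, d_max):
    """Uniformly quantize depth values in [d_min, d_max] to 2**bits levels.

    Hypothetical helper for illustration; real codecs may quantize differently.
    """
    levels = 2 ** bits - 1
    # Map depth to [0, 1], snap to the nearest integer code, then map back.
    normalized = (np.clip(depth, d_min, d_max) - d_min) / (d_max - d_min)
    codes = np.round(normalized * levels)
    return codes / levels * (d_max - d_min) + d_min


# A gently sloping surface: 8 samples rising by 1 mm each, starting at 2 m.
surface = 2.0 + 0.001 * np.arange(8)

# Assumed setting: 4-bit codes over a 0-10 m range (step of about 0.67 m).
compressed = quantize_depth(surface, bits=4, d_min=0.0, d_max=10.0)

n_unique_before = np.unique(surface).size
n_unique_after = np.unique(compressed).size
print(n_unique_before, n_unique_after)
```

In this toy setting every sample falls into the same quantization bin, so the slope of the surface disappears from the compressed map. Recovering such detail requires information from outside the depth map itself, which is consistent with the summary's description of FGDE drawing on image features.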
### Summary of mathematical formulas

- **Depth-map reconstruction formula**:

  \[ \hat{D}_{hq}=\text{GDNet}(I, D_{lq}) \]

  where \(\hat{D}_{hq}\) represents the recovered high-quality depth map, \(D_{lq}\) represents the compressed low-quality depth map, and \(I\) represents the corresponding RGB image.

- **SILog loss function**:

  \[ G_i = \log(\hat{D}_{hq}^i)-\log(D_{hq}^i) \]

  \[ L_{\text{SILog}}=\alpha\sqrt{\frac{1}{n}\sum_{i}G_i^2-\lambda\left(\frac{1}{n}\sum_{i}G_i\right)^2} \]

  where \(\hat{D}_{hq}^i\) and \(D_{hq}^i\) respectively represent the values of the predicted high-quality depth map and the real depth map at pixel \(i\), \(n\) is the total number of pixels in the image, and \(\alpha\) and \(\lambda\) are hyperparameters.

### Conclusion

By introducing the Geometry-Decoupled Network (GDNet), this research addresses two key problems in super-resolution of compressed depth maps: detail loss caused by bit-depth compression and the influence of random noise on the global geometric structure. The experimental results show that GDNet significantly outperforms existing methods on multiple benchmark datasets in both detail recovery and global geometric consistency.
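The SILog loss given in the summary of formulas can be sketched in NumPy as follows. This is a minimal sketch, not the authors' implementation: the default values `alpha=10.0` and `lam=0.85`, the `eps` clamp, and the masking of pixels without a positive target are all assumptions, since the summary only states that \(\alpha\) and \(\lambda\) are hyperparameters.

```python
import numpy as np


def silog_loss(pred, target, alpha=10.0, lam=0.85, eps=1e-6):
    """Scale-invariant log (SILog) loss between predicted and target depth.

    alpha, lam, and eps defaults are assumed values, not taken from the paper.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # Assumption: pixels with non-positive target depth are treated as invalid.
    valid = target > eps

    # G_i = log(pred_i) - log(target_i); pred is clamped to avoid log(0).
    g = np.log(np.maximum(pred[valid], eps)) - np.log(target[valid])

    # (1/n) * sum(G_i^2) - lam * ((1/n) * sum(G_i))^2
    inner = np.mean(g ** 2) - lam * np.mean(g) ** 2

    # Guard against tiny negative values caused by floating-point round-off.
    return alpha * np.sqrt(max(inner, 0.0))


target_depth = np.array([[1.0, 2.0], [4.0, 8.0]])

loss_exact = silog_loss(target_depth, target_depth)
loss_scaled_lam1 = silog_loss(2.0 * target_depth, target_depth, lam=1.0)
loss_scaled = silog_loss(2.0 * target_depth, target_depth)
print(loss_exact, loss_scaled_lam1, loss_scaled)
```

With \(\lambda = 1\) the term under the square root is the variance of \(G_i\), so a prediction that differs from the target only by a global scale factor incurs (numerically) zero loss. With \(\lambda < 1\) a global scale error is still penalized, though less than per-pixel errors of the same magnitude.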