Self-Distilled Depth Refinement with Noisy Poisson Fusion

Jiaqi Li, Yiran Wang, Jinghong Zheng, Zihao Huang, Ke Xian, Zhiguo Cao, Jianming Zhang
2024-10-14
Abstract: Depth refinement aims to infer high-resolution depth with fine-grained edges and details, refining low-resolution results of depth estimation models. The prevailing methods adopt tile-based manners by merging numerous patches, which lacks efficiency and produces inconsistency. Besides, prior arts suffer from fuzzy depth boundaries and limited generalizability. Analyzing the fundamental reasons for these limitations, we model depth refinement as a noisy Poisson fusion problem with local inconsistency and edge deformation noises. We propose the Self-distilled Depth Refinement (SDDR) framework to enforce robustness against the noises, which mainly consists of depth edge representation and edge-based guidance. With noisy depth predictions as input, SDDR generates low-noise depth edge representations as pseudo-labels by coarse-to-fine self-distillation. Edge-based guidance with edge-guided gradient loss and edge-based fusion loss serves as the optimization objective equivalent to Poisson fusion. When depth maps are better refined, the labels also become more noise-free. Our model can acquire strong robustness to the noises, achieving significant improvements in accuracy, edge quality, efficiency, and generalizability on five different benchmarks. Moreover, directly training another model with edge labels produced by SDDR brings improvements, suggesting that our method could help with training robust refinement models in future works.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper targets several problems that arise when refining depth maps:

1. **Low-resolution depth maps**: Depth estimation models usually output low-resolution depth maps that lack fine-grained edges and details, so a method is needed to turn them into high-resolution depth maps with sharp edges and details.
2. **Efficiency and consistency**: The prevailing methods are tile-based and merge a large number of local patches. This is computationally expensive and prone to inconsistent depth structures, such as fractures or blurring where patches are stitched together (e.g., the billboards and walls in Figure 1(a) of the paper).
3. **Noise and blurred boundaries**: Depth maps produced by existing methods often contain noise and blurred boundaries, and these problems are more pronounced on real-scene data. Highly accurate depth labels are crucial for refining details, but synthetic datasets cannot fully capture the complexity and diversity of the real world, which limits the model's ability to generalize.
4. **Lack of robustness**: Many existing methods are trained on synthetic datasets and therefore perform poorly on real-world data. Other methods use pseudo-labels as supervision, but because those pseudo-labels are themselves noisy, the quality of the resulting depth maps suffers.

To address these problems, the authors propose the Self-distilled Depth Refinement (SDDR) framework, which has two main components:

- **Depth edge representation**: A low-noise depth edge representation is generated as a pseudo-label through coarse-to-fine self-distillation.
- **Edge-based guidance**: An edge-guided gradient loss and an edge-based fusion loss serve as the optimization objective for the refined depth maps.

SDDR models depth refinement as a noisy Poisson fusion problem, using local inconsistency noise and edge deformation noise to describe the errors in depth predictions. Experimental results show that SDDR significantly outperforms existing methods on multiple benchmarks, particularly in depth accuracy, edge quality, and model efficiency.
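To make the Poisson fusion idea concrete, here is a minimal 1D sketch. It is not the SDDR implementation: it assumes an idealized, noise-free setting and a screened least-squares form of Poisson fusion chosen only for illustration. The function name `poisson_fuse_1d`, the weight `lam`, and the hand-made edge mask are all hypothetical. The fused profile takes its gradients from the high-resolution prediction inside the edge mask and from the low-resolution prediction elsewhere, while staying anchored to the low-resolution depth values.

```python
import numpy as np


def poisson_fuse_1d(d_low, d_high, edge_mask, lam=1e-2):
    """Screened Poisson fusion of two 1D depth profiles (illustrative only).

    d_low:     low-resolution prediction (globally consistent, blurry edges)
    d_high:    high-resolution prediction (sharp edges, possibly shifted)
    edge_mask: boolean array of length n - 1 marking gradient positions
               where the high-resolution gradients should be used
    lam:       hypothetical weight anchoring the result to d_low
    """
    d_low = np.asarray(d_low, dtype=float)
    d_high = np.asarray(d_high, dtype=float)
    n = d_low.size
    # Forward-difference operator G, shape (n - 1, n): (G @ d)[i] = d[i + 1] - d[i].
    G = np.diff(np.eye(n), axis=0)
    # Guidance gradients: high-resolution inside the mask, low-resolution outside.
    g = np.where(edge_mask, G @ d_high, G @ d_low)
    # Normal equations of: min_d ||G d - g||^2 + lam * ||d - d_low||^2.
    A = G.T @ G + lam * np.eye(n)
    b = G.T @ g + lam * d_low
    return np.linalg.solve(A, b)


# Toy example: a depth step from 1.0 to 3.0 at index 32.
n = 64
truth = np.where(np.arange(n) < 32, 1.0, 3.0)
# Low-resolution prediction: correct depth levels, but the edge is a blurry ramp.
d_low = np.interp(np.arange(n), [24, 40], [1.0, 3.0])
# High-resolution prediction: sharp edge, but globally shifted (inconsistent).
d_high = truth + 0.5
# Hand-made mask covering the blurred region around the edge.
edge_mask = np.zeros(n - 1, dtype=bool)
edge_mask[22:42] = True

fused = poisson_fuse_1d(d_low, d_high, edge_mask)
print("max error, low-res:", np.max(np.abs(d_low - truth)))
print("max error, fused:  ", np.max(np.abs(fused - truth)))
```

In this toy case the fused profile recovers the sharp step of the high-resolution input at the depth levels of the low-resolution input. Real predictions are noisy, and that noise in the gradients and in the mask is what the paper's local inconsistency and edge deformation terms describe.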