Abstract: Existing depth completion methods are often targeted at a specific sparse depth type and generalize poorly across task domains. We present a method to complete sparse/semi-dense, noisy, and potentially low-resolution depth maps obtained by various range sensors, including those in modern mobile phones, or by multi-view reconstruction algorithms. Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model. We propose an effective training scheme where we simulate various sparsity patterns in typical task domains. In addition, we design two new benchmarks to evaluate the generalizability and the robustness of depth completion methods. Our simple method shows superior cross-domain generalization ability against state-of-the-art depth completion methods, introducing a practical solution to high-quality depth capture on a mobile device. The code is available at: https://github.com/YvanYin/FillDepth

What problem does this paper attempt to address?

The problem this paper attempts to address is the poor generalization ability and sensitivity to noise of existing depth completion methods across different task domains. Specifically, existing depth completion methods are usually tailored for specific types of sparse depth, and their performance significantly degrades when applied to other types of sparse depth or in the presence of noise. To solve these issues, the authors propose a new method aimed at improving the generalization ability and robustness to noise of depth completion models across different task domains.

### Main Contributions

1. **Analysis of Existing Methods**: The authors analyze the generalization ability and robustness to noise of existing "RGB + sparse depth" depth completion methods across different domains.
2. **Proposing a New Method**: A new method is proposed that combines data-driven single image priors and effective data augmentation techniques to achieve cross-domain depth completion.
3. **Designing New Benchmarks**: Two new synthetic benchmarks are designed to evaluate the robustness and generalization ability of depth completion methods. These benchmarks simulate challenges in various real-world application scenarios.

### Method Overview

- **Model Architecture and Training**: The method takes an RGB image, a sparse depth map, and a guided depth map (the output of the single image depth prediction network) as inputs and outputs a dense completed depth map. The model architecture is based on the ESANet-R34-NBt1D network and is supervised using multiple loss functions (such as virtual normal loss, pairwise normal regression loss, ranking margin loss, and L1 loss).
- **Sparse Pattern Generation**: To increase the model's robustness, the authors simulate various sparse patterns during training, including uniform distribution, feature point sampling, and region occlusion.
- **Improving Robustness to Outliers**: By introducing outliers during training, the model is encouraged to learn how to handle inconsistent depth information.
- **Benchmark Redesign**: Two new benchmarks are proposed to evaluate the model's generalization ability under different sparse patterns and its robustness to noise.

### Experimental Results

- **Existing Benchmarks**: On the NYU and Matterport3D benchmarks, the method performs comparably to existing methods, despite not being trained on these datasets.
- **New Benchmarks**: On the newly designed benchmarks, the method demonstrates stronger generalization ability and robustness to noise.
- **Mobile Device Testing**: On the DualPixel dataset, the method further showcases its generalization ability on mobile sensors.

In summary, the paper targets the weak cross-domain generalization and noise robustness of existing depth completion methods. It proposes a new depth completion method and two new benchmarks, providing a practical solution for high-quality depth sensing on low-cost mobile devices.
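
The sparse pattern generation and outlier injection described in the Method Overview can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`sample_uniform`, `occlude_region`, `add_outliers`), the use of zero to mark missing depth, and all parameter values are assumptions made for illustration. Feature point sampling is omitted because the summary does not say which detector is used.

```python
import numpy as np


def sample_uniform(depth, num_points, rng):
    """Keep `num_points` valid pixels chosen uniformly at random.

    Assumption: a value of 0 means "no depth measurement".
    """
    src = depth.reshape(-1)
    valid = np.flatnonzero(src > 0)
    keep = rng.choice(valid, size=min(num_points, valid.size), replace=False)
    flat = np.zeros(depth.size, dtype=depth.dtype)
    flat[keep] = src[keep]
    return flat.reshape(depth.shape)


def occlude_region(depth, top, left, height, width):
    """Drop all measurements inside a rectangle to mimic a sensor hole."""
    out = depth.copy()
    out[top:top + height, left:left + width] = 0
    return out


def add_outliers(sparse, fraction, max_depth, rng):
    """Replace a fraction of the remaining measurements with random depths."""
    out = sparse.copy()
    ys, xs = np.nonzero(out)
    n = int(round(fraction * ys.size))
    if n == 0:
        return out
    idx = rng.choice(ys.size, size=n, replace=False)
    # Hypothetical noise model: outliers drawn uniformly from (0.1, max_depth).
    out[ys[idx], xs[idx]] = rng.uniform(0.1, max_depth, size=n)
    return out


# Toy usage on a synthetic dense depth map (values in meters, all valid).
rng = np.random.default_rng(0)
dense = rng.uniform(0.5, 10.0, size=(48, 64))
sparse = sample_uniform(dense, 500, rng)
sparse = occlude_region(sparse, top=10, left=20, height=16, width=16)
noisy = add_outliers(sparse, fraction=0.05, max_depth=10.0, rng=rng)
```

In a training loop, one such pattern would be drawn at random per sample so that the model sees several sparsity types and noisy measurements, which is the idea the summary attributes to the training scheme.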

Towards Domain-agnostic Depth Completion

Least Square Estimation Network for Depth Completion

Semantic-guided Depth Completion from Monocular Images and 4D Radar Data

MFF-Net: Towards Efficient Monocular Depth Completion With Multi-Modal Feature Fusion

Aggregating Feature Point Cloud for Depth Completion

RigNet++: Semantic Assisted Repetitive Image Guided Network for Depth Completion

Depth Completion from Sparse LiDAR Data with Depth-Normal Constraints

Deep Depth Completion from Extremely Sparse Data: A Survey

Recent Advances in Conventional and Deep Learning-Based Depth Completion: A Survey

Depth Completion Towards Different Sensor Configurations Via Relative Depth Map Estimation and Scale Recovery

A Real-Time Semi-Dense Depth-Guided Depth Completion Network

Self-Supervised Depth Completion Via Adaptive Sampling and Relative Consistency

Depth-Independent Depth Completion via Least Square Estimation

Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data

Towards Better Unguided Depth Completion via Cross-Modality Knowledge Distillation in the Frequency Domain

Self-supervised Sparse-to-Dense: Self-supervised Depth Completion from LiDAR and Monocular Camera

Learning an Efficient Multimodal Depth Completion Model

Depth Map Completion by Jointly Exploiting Blurry Color Images and Sparse Depth Maps

Efficient Depth Completion Using Learned Bases

DenseLiDAR: A Real-Time Pseudo Dense Depth Guided Depth Completion Network