Abstract: The inherent sparsity of LiDAR data often leads to extremely sparse depth maps, which poses a challenge for LiDAR-based egocentric vehicles such as self-driving cars and mobile robots. To overcome this limitation, guided depth completion methods use calibrated camera images to produce precise, dense depth predictions from sparse LiDAR data. However, heavy reliance on camera images limits the generalization of guided depth completion, especially its robustness to weather and lighting conditions. In this paper, we use camera images only during the training phase to improve unguided depth completion, and discard the camera in the inference phase. We comprehensively analyze the role of camera images in the depth completion task and quantitatively demonstrate the substantial contribution of the frequency distribution within local windows. We then introduce cross-modality knowledge distillation to align LiDAR features with camera features in the frequency domain, yielding corresponding guidance features. We devise a guidance and selection module that mitigates the unavoidable inaccuracies of knowledge distillation, enhances depth features, and selects the more precise encoded values from the guidance branch and the unguided input. To further refine the completion result, we propose a progressive depth completion module consisting of two sub-networks connected by an attention for refinement module, which produces weighted features from the decoder of the first stage to enhance the features in the encoder of the second stage. We name our method Better Unguided Network (BUNet) and evaluate it on the KITTI depth completion benchmark and the NYUv2 dataset, where it outperforms methods that exclude camera images during the inference phase.
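
To make the idea of aligning features by their frequency distribution within local windows more concrete, the following is a minimal sketch and not the loss used in BUNet, which the abstract does not specify. It assumes single-channel feature maps stored as 2D lists, non-overlapping square windows, a plain 2D discrete Fourier transform, and an L1 distance between magnitude spectra. The names `dft2_magnitude` and `local_frequency_loss` are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft2_magnitude(patch):
    """Return the magnitude of the 2D DFT of a square patch (list of lists)."""
    k = len(patch)
    mags = []
    for u in range(k):
        row = []
        for v in range(k):
            acc = 0j
            for y in range(k):
                for x in range(k):
                    angle = -2.0 * math.pi * (u * y + v * x) / k
                    acc += patch[y][x] * cmath.exp(1j * angle)
            row.append(abs(acc))
        mags.append(row)
    return mags


def local_frequency_loss(student, teacher, k=2):
    """Mean absolute difference between per-window magnitude spectra.

    Illustrative assumption: `student` stands in for a LiDAR-branch feature
    map and `teacher` for a camera-branch feature map of the same shape.
    Height and width must be multiples of the window size k.
    """
    h, w = len(student), len(student[0])
    if h != len(teacher) or w != len(teacher[0]) or h % k or w % k:
        raise ValueError("maps must share a shape divisible by k")
    total, count = 0.0, 0
    # Visit each non-overlapping k x k window of both maps.
    for top in range(0, h, k):
        for left in range(0, w, k):
            s = [r[left:left + k] for r in student[top:top + k]]
            t = [r[left:left + k] for r in teacher[top:top + k]]
            s_mag = dft2_magnitude(s)
            t_mag = dft2_magnitude(t)
            # Accumulate the L1 distance over all frequency bins.
            for u in range(k):
                for v in range(k):
                    total += abs(s_mag[u][v] - t_mag[u][v])
                    count += 1
    return total / count
```

In a training setup of the kind the abstract describes, a term like this would be minimized only during training, so the camera-derived `teacher` map is not needed at inference. A practical implementation would more likely use a tensor library's FFT routines over batched, multi-channel features than the explicit loops shown here.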

Towards Better Unguided Depth Completion via Cross-Modality Knowledge Distillation in the Frequency Domain
