Abstract: Recently, radar-camera fusion algorithms have gained significant attention because radar sensors provide geometric information that compensates for the limitations of cameras. However, most existing radar-camera depth estimation algorithms focus solely on improving performance, often neglecting computational efficiency. To address this gap, we propose LiRCDepth, a lightweight radar-camera depth estimation model. We incorporate knowledge distillation to enhance the training process, transferring critical information from a complex teacher model to our lightweight student model in three key domains. Firstly, low-level and high-level features are transferred by incorporating pixel-wise and pair-wise distillation. Additionally, we introduce an uncertainty-aware inter-depth distillation loss to refine intermediate depth maps during decoding. Leveraging our proposed knowledge distillation scheme, the lightweight model achieves a 6.6% improvement in MAE on the nuScenes dataset compared to the model trained without distillation.

What problem does this paper attempt to address?

The problem this paper addresses is that existing radar-camera depth estimation algorithms have improved accuracy but often overlook computational efficiency. Specifically, most existing methods concentrate on model performance while ignoring model size and inference speed, which makes them difficult to deploy efficiently in practical applications such as autonomous driving. To solve this problem, the authors propose a lightweight radar-camera depth estimation model named LiRCDepth. The model uses knowledge distillation to transfer key information from a complex teacher model to a lightweight student model, thereby significantly reducing parameter count and computational complexity while maintaining high performance.

### Main Contributions

1. **Introduction of an Uncertainty-Rectified Depth Loss Function**: To guide depth prediction more accurately, the authors design a new loss function, the Uncertainty-Rectified Depth Loss (URDL), which can better handle prediction errors.
2. **Lightweight Radar-Camera Depth Estimation Framework**: Compared with existing complex models, LiRCDepth has approximately 80% fewer parameters, while its performance on the nuScenes dataset is comparable to that of heavier models and even surpasses them on some metrics.
3. **Multi-stage Knowledge Distillation Strategy**: The authors enhance the training of the lightweight model through knowledge distillation in three key areas:
   - **Low-level and High-level Feature Distillation**: Transfer the features of the teacher model to the student model through pixel-wise and pair-wise similarity distillation.
   - **Intermediate Depth Map Distillation**: Introduce an uncertainty-aware intermediate depth map distillation loss to optimize the generation of depth maps during the decoding process.
4. **Experimental Verification**: Quantitative and qualitative experiments demonstrate the effectiveness and superiority of LiRCDepth, with particularly strong depth estimation performance in long-range scenarios.

### Formula Summary

1. **Single-modal Feature Distillation Loss**:
   \[ L_{S-M}^{KD} = L_I^{KD} + L_R^{KD} = \sum_{i=1}^{5} \frac{1}{2^i} \left( \left\| (F_S^C)_i - (F_T^C)_i \right\|_1 + \left\| (F_S^R)_i - (F_T^R)_i \right\|_1 \right) \]
2. **Structure-guided Feature Distillation Loss**:
   \[ L_{Dec}^{KD} = \sum_{i=1}^{5} \frac{1}{2^i} \frac{1}{(W_i H_i)^2} \sum_{p=1}^{W_i H_i} \sum_{q=1}^{W_i H_i} \left( (\alpha_S^{p,q})_i - (\alpha_T^{p,q})_i \right)^2 \]
   where \(\alpha^{p,q} = \frac{f_p^\top f_q}{\|f_p\|_2 \|f_q\|_2}\) is the cosine similarity between the features \(f_p\) and \(f_q\) at pixels \(p\) and \(q\).
3. **Uncertainty-aware Intermediate Depth Map Distillation Loss**:
   \[ L_D^{KD} = \sum_{i=1}^{3} \frac{1}{2^i} \left\| U_i \odot \left| (D_S^{inter})_i - (D_T^{lpg})_i \right| \right\|_1 \]
   where \(U = J_{H,W,1} - \exp\left( -\frac{|D_{pred} - D_{gt}|}{\beta |D_{pred} + D_{gt}|} \right)\) is the uncertainty map.
4. **Total Loss Function**:
   \[ L_{total} = L_{Depth} + \gamma_1 L_I^{KD} + \gamma_2 L_R^{KD} + \gamma_3 L_{Dec}^{KD} + \cdots \]
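
The first three losses in the formula summary can be sketched in NumPy to make the indexing and weighting concrete. This is a minimal illustration and not the authors' implementation. It rests on several assumptions: all function names are hypothetical; feature maps are `(C, H, W)` arrays; student and teacher features have matching shapes for the pixel-wise term; \(J_{H,W,1}\) is read as an all-ones map; the \(\|\cdot\|_1\) norm is taken literally as a sum of absolute values (implementations often average instead); `beta=1.0` is only a placeholder; and the small epsilon guards against division by zero are additions that do not appear in the formulas.

```python
import numpy as np

EPS = 1e-12  # numerical guard; an assumption, not part of the summarized formulas


def pixelwise_l1(student_feats, teacher_feats):
    """Sum over levels i = 1..N of 2**-i * ||F_S_i - F_T_i||_1 (one modality)."""
    return sum(
        np.abs(fs - ft).sum() / 2 ** i
        for i, (fs, ft) in enumerate(zip(student_feats, teacher_feats), start=1)
    )


def cosine_affinity(feat):
    """Pairwise cosine similarity alpha^{p,q} for a (C, H, W) feature map."""
    flat = feat.reshape(feat.shape[0], -1).T             # (H*W, C), one row per pixel
    norms = np.linalg.norm(flat, axis=1, keepdims=True)  # ||f_p||_2
    unit = flat / np.maximum(norms, EPS)
    return unit @ unit.T                                 # (H*W, H*W)


def pairwise_loss(student_feats, teacher_feats):
    """Sum over levels of 2**-i * mean squared difference of the affinities."""
    total = 0.0
    for i, (fs, ft) in enumerate(zip(student_feats, teacher_feats), start=1):
        diff = cosine_affinity(fs) - cosine_affinity(ft)
        total += (diff ** 2).mean() / 2 ** i  # mean equals sum / (W_i * H_i)**2
    return total


def uncertainty_map(d_pred, d_gt, beta=1.0):
    """U = 1 - exp(-|d_pred - d_gt| / (beta * |d_pred + d_gt|))."""
    denom = np.maximum(beta * np.abs(d_pred + d_gt), EPS)
    return 1.0 - np.exp(-np.abs(d_pred - d_gt) / denom)


def inter_depth_loss(student_depths, teacher_depths, unc_maps):
    """Sum over levels of 2**-i * ||U_i * |D_S_i - D_T_i|||_1."""
    return sum(
        np.abs(u * np.abs(ds - dt)).sum() / 2 ** i
        for i, (ds, dt, u) in enumerate(
            zip(student_depths, teacher_depths, unc_maps), start=1
        )
    )


# Toy usage with random data (shapes chosen only for illustration).
rng = np.random.default_rng(0)
teacher = [rng.normal(size=(8, 4, 4)), rng.normal(size=(8, 2, 2))]
student = [t + 0.1 * rng.normal(size=t.shape) for t in teacher]
print("pixel-wise:", pixelwise_l1(student, teacher))
print("pair-wise:", pairwise_loss(student, teacher))
```

One property worth noting: each affinity matrix has shape `(H*W, H*W)` regardless of the channel count, so the pair-wise term can compare student and teacher features with different numbers of channels, as long as their spatial sizes agree. The pixel-wise term, by contrast, needs identical shapes.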

LiRCDepth: Lightweight Radar-Camera Depth Estimation via Knowledge Distillation and Uncertainty Guidance
