Abstract: 3D object detection is one of the fundamental perception tasks for autonomous vehicles. Fulfilling such a task with a 4D millimeter-wave radar is very attractive since the sensor is able to acquire 3D point clouds similar to Lidar while maintaining robust measurements under adverse weather. However, due to the high sparsity and noise associated with the radar point clouds, the performance of the existing methods is still much lower than expected. In this paper, we propose a novel Semi-supervised Cross-modality Knowledge Distillation (SCKD) method for 4D radar-based 3D object detection. It characterizes the capability of learning the feature from a Lidar-radar-fused teacher network with semi-supervised distillation. We first propose an adaptive fusion module in the teacher network to boost its performance. Then, two feature distillation modules are designed to facilitate the cross-modality knowledge transfer. Finally, a semi-supervised output distillation is proposed to increase the effectiveness and flexibility of the distillation framework. With the same network structure, our radar-only student trained by SCKD boosts the mAP by 10.38% over the baseline and outperforms the state-of-the-art works on the VoD dataset. The experiment on ZJUODset also shows 5.12% mAP improvements on the moderate difficulty level over the baseline when extra unlabeled data are available. Code is available at https://github.com/Ruoyu-Xu/SCKD.
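The semi-supervised output distillation mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the loss or the filtering rule, so everything below is an assumption made for illustration: the `Box` type, the confidence threshold `min_score`, the capped center-distance loss, and all function names are hypothetical and are not taken from the SCKD code. The idea shown is only that confident teacher predictions stand in for ground-truth labels, so unlabeled frames can still supervise the radar-only student.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Hypothetical detection: 3D box center, class label, confidence score."""
    x: float
    y: float
    z: float
    label: str
    score: float


def select_pseudo_labels(teacher_boxes: List[Box], min_score: float = 0.5) -> List[Box]:
    """Keep teacher predictions confident enough to act as pseudo-labels.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    return [b for b in teacher_boxes if b.score >= min_score]


def output_distillation_loss(student_boxes: List[Box],
                             pseudo_labels: List[Box],
                             max_dist: float = 2.0) -> float:
    """Mean squared center distance from each pseudo-label to the nearest
    student box of the same class, capped at max_dist ** 2 so that a missed
    object contributes a bounded penalty. Returns 0.0 without pseudo-labels.
    """
    if not pseudo_labels:
        return 0.0
    cap = max_dist ** 2
    total = 0.0
    for target in pseudo_labels:
        best = cap
        for pred in student_boxes:
            if pred.label != target.label:
                continue  # only compare boxes of the same class
            d2 = ((pred.x - target.x) ** 2
                  + (pred.y - target.y) ** 2
                  + (pred.z - target.z) ** 2)
            best = min(best, d2)
        total += best
    return total / len(pseudo_labels)


if __name__ == "__main__":
    # An unlabeled frame: the teacher output is the only supervision available.
    teacher_out = [Box(0.0, 0.0, 0.0, "Car", 0.9),
                   Box(10.0, 0.0, 0.0, "Pedestrian", 0.2)]
    student_out = [Box(1.0, 0.0, 0.0, "Car", 0.6)]
    pseudo = select_pseudo_labels(teacher_out)
    print(output_distillation_loss(student_out, pseudo))
```

A real detector would also distill box size, orientation, and class scores; the center-only loss here just keeps the example short.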

What problem does this paper attempt to address?

This paper addresses the poor performance of 3D object detection based on 4D millimeter-wave radar. Existing 4D radar methods fall well short of expectations because radar point clouds are highly sparse and noisy. Although 4D radar remains robust in bad weather and provides 3D point cloud data similar to Lidar, its point cloud density is only one-tenth that of Lidar, and it contains "ghost points" caused by the multipath effect, which greatly reduce its measurement accuracy.

To address these problems, the paper proposes **SCKD (Semi-Supervised Cross-Modality Knowledge Distillation)**. The authors aim to significantly improve 4D radar-based 3D object detection while maintaining real-time performance. The core idea of SCKD is to learn features from a multi-modal fusion teacher network and transfer them to a student network that uses only radar data, thereby strengthening the student's detection ability.

### Main contributions of SCKD

1. **A novel semi-supervised cross-modality distillation framework**: By learning from the teacher network, a simple student network greatly improves its performance while keeping real-time efficiency.
2. **An adaptive fusion module**: Embedded in the teacher network to fuse Lidar and radar features, it improves the teacher's performance and reduces the difficulty of knowledge transfer.
3. **Two feature distillation modules**: Lidar-to-radar feature distillation (LRFD) and fusion-to-radar feature distillation (FRFD), which strengthen the effect of feature distillation.
4. **Semi-supervised output distillation (SSOD)**: The student network no longer requires ground-truth supervision, which makes the method more flexible and lets it exploit large amounts of unlabeled data.
5. **Experimental verification**: With the same network structure, the radar-only student trained by SCKD gains 10.38% mAP over the baseline on the VoD dataset and outperforms state-of-the-art methods. On ZJUODset, it gains 5.12% mAP over the baseline at the moderate difficulty level when extra unlabeled data are available.

### Key technologies of the solution

- **Teacher network**: A Lidar-radar dual-modal fusion network with an adaptive fusion module, which generates richer semantic information.
- **Feature distillation**: The LRFD and FRFD modules transfer the teacher network's features to the student network.
- **Semi-supervised output distillation**: The teacher network's predictions serve as supervision signals, reducing the dependence on expensive labeled data.

Together, these techniques improve 4D radar-based 3D object detection and show the method's potential for practical applications, especially autonomous driving.
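The two feature distillation modules described above (LRFD and FRFD) can be sketched as feature-matching losses. The summary does not specify the loss form or where in the network each module is applied, so this is a minimal sketch under stated assumptions: a plain mean squared error between single-channel bird's-eye-view feature maps, an optional foreground mask, and the weights `w_lrfd` and `w_frfd` are all hypothetical choices, not the authors' implementation.

```python
from typing import List, Optional

# Single-channel bird's-eye-view feature map of shape H x W (assumed layout).
Grid = List[List[float]]


def feature_mse(student: Grid, teacher: Grid, mask: Optional[Grid] = None) -> float:
    """Mean squared error between two same-shaped feature maps.

    If a mask is given, each cell is weighted by it, so a 0/1 foreground
    mask restricts the loss to object regions. Returns 0.0 if the mask
    selects no cells.
    """
    total, weight = 0.0, 0.0
    for r, (s_row, t_row) in enumerate(zip(student, teacher)):
        for c, (s, t) in enumerate(zip(s_row, t_row)):
            w = 1.0 if mask is None else mask[r][c]
            total += w * (s - t) ** 2
            weight += w
    return total / weight if weight > 0 else 0.0


def feature_distillation_loss(radar_feat: Grid,
                              lidar_feat: Grid,
                              fused_feat: Grid,
                              w_lrfd: float = 1.0,
                              w_frfd: float = 1.0,
                              mask: Optional[Grid] = None) -> float:
    """Weighted sum of a Lidar-to-radar term and a fusion-to-radar term.

    radar_feat comes from the student; lidar_feat and fused_feat come from
    the (frozen) teacher. The weights are assumed hyperparameters.
    """
    lrfd = feature_mse(radar_feat, lidar_feat, mask)
    frfd = feature_mse(radar_feat, fused_feat, mask)
    return w_lrfd * lrfd + w_frfd * frfd


if __name__ == "__main__":
    radar = [[0.0, 1.0], [2.0, 3.0]]
    lidar = [[0.0, 1.0], [2.0, 5.0]]
    fused = [[1.0, 1.0], [2.0, 3.0]]
    print(feature_distillation_loss(radar, lidar, fused))
```

In practice these maps would be multi-channel tensors in a deep learning framework, and the student features may need a projection layer to match the teacher's channel count; nested lists are used here only to keep the sketch dependency-free.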

SCKD: Semi-Supervised Cross-Modality Knowledge Distillation for 4D Radar Object Detection

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

X$^3$KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection

Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection

Efficient 4D Radar Data Auto-labeling Method using LiDAR-based Object Detection Network

Adaptive Knowledge Distillation for Lightweight Remote Sensing Object Detectors Optimizing

RODNet: Radar Object Detection Using Cross-Modal Supervision

Consistency- and dependence-guided knowledge distillation for object detection in remote sensing images

CenterRadarNet: Joint 3D Object Detection and Tracking Framework using 4D FMCW Radar

Category-oriented Localization Distillation for SAR Object Detection and A Unified Benchmark

Dual Radar: A Multi-modal Dataset with Dual 4D Radar for Autonomous Driving

UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View

LEROjD: Lidar Extended Radar-Only Object Detection

K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions

Towards Efficient 3D Object Detection with Knowledge Distillation

SMURF: Spatial Multi-Representation Fusion for 3D Object Detection with 4D Imaging Radar

Cross-Modal Object Detection via UAV

LabelDistill: Label-guided Cross-modal Knowledge Distillation for Camera-based 3D Object Detection

Enhanced K-Radar: Optimal Density Reduction to Improve Detection Performance and Accessibility of 4D Radar Tensor-based Object Detection

Attention-Based Depth Distillation with 3D-Aware Positional Encoding for Monocular 3D Object Detection