InstKD: Towards Lightweight 3D Object Detection With Instance-Aware Knowledge Distillation
Haonan Zhang, Longjun Liu, Yuqi Huang, Xinyu Lei, Lei Tong, Bihan Wen
DOI: https://doi.org/10.1109/TIV.2024.3401461
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Deep neural networks (DNNs) are extensively explored for LiDAR-based 3D object detection, a crucial perception task in autonomous driving. However, redundant parameters and complex computations pose challenges for the practical deployment of DNNs. Although knowledge distillation (KD) is an effective approach for accelerating models, very few efforts have explored its potential on LiDAR-based 3D detectors. Moreover, existing studies have not thoroughly investigated 3D voxel-wise features for compression. To this end, we propose instance-aware knowledge distillation (InstKD) for 3D detector compression. The proposed method conducts KD by fully exploiting two types of knowledge related to 3D voxel-wise features. First, the 3D voxel-wise features of the teacher are transferred to the student. To prioritize the knowledge with strong guiding capacity, we introduce the expanded bounding box (E-Bbox) to distinguish and balance the foreground and background regions. In addition, we generate a contribution map (CM) by calculating the gap between the classification responses of the teacher and student models, which dynamically balances individual instances during distillation. Second, we align the relation-based knowledge of 3D voxel-wise features between the distillation pairs. To avoid computing intractable relations over a massive number of 3D voxel-wise features, we distill the relations among instances selected by E-Bboxes, where the intra-relation of homogeneous instances and the inter-relation of heterogeneous instances are transferred in a dual-pathway manner. In the experiments, we compress different models on benchmarks of varying scales. The results demonstrate that our method yields lightweight 3D detectors with only a slight performance drop. For example, on the KITTI dataset, our 2× compressed SECOND (75.5% parameter and 74.5% FLOPs reduction) achieves 66.83% mAP, surpassing its teacher model. The key code is available at
https://github.com/zhnxjtu/InstKD.
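To make the first type of knowledge in the abstract more concrete, the following is a minimal NumPy sketch of a feature-distillation loss that separates foreground from background with expanded boxes and reweights foreground voxels by the teacher-student classification gap. It is an illustration under stated assumptions, not the authors' implementation: the function names, the expansion ratio of 1.2, the background weight of 0.1, the mean-normalisation of the gap, and the use of axis-aligned boxes (no yaw) are all hypothetical choices made for this example. The relation-based (intra/inter-instance) pathway is not covered here.

```python
import numpy as np


def expand_boxes(boxes, ratio=1.2):
    """Enlarge boxes (cx, cy, cz, dx, dy, dz) about their centers.

    The ratio is an assumed value; the abstract does not specify one.
    """
    out = np.asarray(boxes, dtype=float).copy()
    out[:, 3:6] *= ratio
    return out


def foreground_mask(voxel_centers, boxes):
    """Return True for voxels whose center lies inside any box.

    Simplification: boxes are treated as axis-aligned (heading is ignored).
    """
    centers = np.asarray(voxel_centers, dtype=float)[:, None, :]  # (N, 1, 3)
    boxes = np.asarray(boxes, dtype=float)[None, :, :]            # (1, M, 6)
    inside = np.abs(centers - boxes[..., :3]) <= boxes[..., 3:6] / 2.0
    return inside.all(axis=-1).any(axis=-1)


def contribution_weights(teacher_scores, student_scores, eps=1e-8):
    """Per-voxel weight from the teacher-student classification gap.

    Dividing by the mean gap is an assumed normalisation.
    """
    gap = np.abs(np.asarray(teacher_scores, float)
                 - np.asarray(student_scores, float))
    return gap / (gap.mean() + eps)


def feature_distill_loss(feat_t, feat_s, fg, weights, bg_weight=0.1):
    """Weighted mean-squared error between teacher and student features.

    Foreground voxels are scaled by their contribution weight; background
    voxels share a single small weight (assumed value) for balance.
    """
    diff = np.asarray(feat_t, float) - np.asarray(feat_s, float)
    per_voxel = (diff ** 2).mean(axis=1)
    fg_term = (per_voxel * weights)[fg].sum() / max(int(fg.sum()), 1)
    bg_term = per_voxel[~fg].sum() / max(int((~fg).sum()), 1)
    return float(fg_term + bg_weight * bg_term)
```

In this sketch, a voxel just outside a ground-truth box becomes foreground once the box is expanded, and voxels where the student's classification response already matches the teacher's receive zero foreground weight, so the loss concentrates on regions where the student lags behind.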