When Object Detection Meets Knowledge Distillation: A Survey
Zhihui Li, Pengfei Xu, Xiaojun Chang, Luyao Yang, Yuanyuan Zhang, Lina Yao, Xiaojiang Chen
DOI: https://doi.org/10.1109/tpami.2023.3257546
IF: 23.6
2023-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract: Object detection (OD) is a basic computer vision task. To date, there have been many OD algorithms or models for solving different problems. The performance of the current models has gradually improved and their applications have expanded. However, the models have also become more complex, with larger numbers of parameters, making them unsuitable for industrial applications. The knowledge distillation (KD) technology proposed in 2015 was first applied to image classification in the field of computer vision, and quickly expanded to other visual tasks. The reason for this may be that the complex teacher models can transfer knowledge (learned from large-scale data or other multi-modal data) to lightweight student models, thereby achieving model compression and performance improvement. Although KD was only introduced into OD in 2017, recent years have seen a surge in publication of related works, especially in 2021 and 2022. Therefore, this paper presents a comprehensive survey of KD-based OD models over recent years, in the hope of providing researchers with an overview of recent progress in the field. Moreover, we have conducted in-depth analysis of the existing relevant works to ascertain their advantages and related issues, and further explored future research directions, in an attempt to provide researchers with inspiration and incentive to design models for related tasks.
In brief, we summarize the basic principle of designing KD-based OD models, describe related KD-based OD tasks (performance improvements for lightweight models, catastrophic forgetting in incremental OD, small object detection (S-OD), weakly/semi-supervised OD, etc.), analyze the novel distillation techniques (different types of distillation loss, the feature interaction between teacher and student models, KD of multi-modal prior information, joint distillation using multiple teacher models, self-feature distillation, etc.), and present an overview of the extended applications on several specific datasets (remote sensing images, 3D point cloud datasets, etc.). After comparing and analyzing the performance of different models on several common datasets, we discuss promising directions for solving some specific OD problems.
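As a rough illustration of the teacher-to-student knowledge transfer the abstract refers to, the sketch below computes the classic soft-label distillation loss on classification logits: the KL divergence between temperature-softened teacher and student distributions, scaled by the squared temperature. This is an assumption-laden example, not a method taken from the survey. The function names `softmax` and `kd_loss`, the default temperature of 2.0, and the sample logits are all hypothetical, and detection-specific variants (feature or box-regression distillation) are not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Hypothetical helper: assumes both logit lists have the same length
    and moderate magnitudes, so no probability underflows to zero.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
    )
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # illustrative class logits from a large model
    student = [2.5, 1.5, 1.0]   # illustrative class logits from a small model
    print("distillation loss:", kd_loss(student, teacher))
```

In practice this term is usually combined with the ordinary task loss on ground-truth labels; a higher temperature exposes more of the teacher's relative ranking among non-target classes.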
computer science, artificial intelligence; engineering, electrical & electronic