Anomaly detection based on multi-teacher knowledge distillation

Ye Ma, Xu Jiang, Nan Guan, Wang Yi
DOI: https://doi.org/10.1016/j.sysarc.2023.102861
IF: 5.836
2023-03-25
Journal of Systems Architecture
Abstract: Anomaly detection on high-dimensional data is crucial for real-world industrial applications. Recent works adopt Knowledge Distillation (KD) to improve the accuracy of anomaly-detection Neural Networks (NNs). Most KD-based solutions use only a single teacher NN and therefore do not fully exploit the distinct advantages of different NN structures. To fill this gap, this paper proposes a novel Multi-teacher Knowledge Distillation approach, which integrates multiple teachers through importance weights to guide the student toward accurate anomaly detection. However, the importance weights are hard to obtain when training only on normal data. To overcome this challenge, we use an autoencoder-based reconstruction process to update the teacher importance weights, while the student model parameters are optimized given a set of teacher importance weights. Anomalies are then detected based on the deviations between the teacher and student outputs, as well as the reconstruction errors of the student network. The proposed approach is evaluated on the CIFAR10 and MVTec datasets. The results show good performance on both high-level semantic anomaly detection and low-level pixel anomaly detection.
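The scoring idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the squared Euclidean distance as the deviation measure, the additive combination of the two terms, and the names `normalize_weights`, `anomaly_score`, and `recon_coeff` are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalize_weights(raw: Sequence[float]) -> List[float]:
    """Scale non-negative raw importance values so they sum to 1 (assumed scheme)."""
    total = sum(raw)
    if total <= 0 or any(w < 0 for w in raw):
        raise ValueError("weights must be non-negative with a positive sum")
    return [w / total for w in raw]


def anomaly_score(
    teacher_outputs: Sequence[Sequence[float]],
    student_outputs: Sequence[Sequence[float]],
    weights: Sequence[float],
    x: Sequence[float],
    reconstruction: Sequence[float],
    recon_coeff: float = 1.0,  # hypothetical trade-off factor, not from the paper
) -> float:
    """Weighted teacher-student deviation plus reconstruction error.

    teacher_outputs[i] and student_outputs[i] are the features of teacher i
    and the student's matching output for the same input x.
    """
    if not (len(teacher_outputs) == len(student_outputs) == len(weights)):
        raise ValueError("need one student output and one weight per teacher")
    deviation = sum(
        w * squared_distance(t, s)
        for w, t, s in zip(weights, teacher_outputs, student_outputs)
    )
    return deviation + recon_coeff * squared_distance(x, reconstruction)


if __name__ == "__main__":
    w = normalize_weights([1.0, 3.0])
    score = anomaly_score(
        teacher_outputs=[[1.0, 0.0], [0.0, 2.0]],
        student_outputs=[[1.0, 0.0], [0.0, 0.0]],
        weights=w,
        x=[1.0, 1.0],
        reconstruction=[1.0, 0.0],
    )
    print(w, score)
```

In this sketch a sample scores high when the student disagrees with heavily weighted teachers or when the input is reconstructed poorly, which matches the two signals the abstract names. How the paper actually learns the weights and combines the terms may differ.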
computer science, software engineering, hardware & architecture