Abstract: Due to data imbalance and the diversity of defects, student-teacher (S-T) networks are favored in unsupervised anomaly detection; they exploit the discrepancy in feature representations that arises from the knowledge distillation process to recognize anomalies. However, the vanilla S-T network is not stable. Employing identical structures for the student and the teacher may weaken the representational discrepancy on anomalies, whereas using different structures can increase the likelihood of divergent performance on normal data. To address this problem, we propose a novel dual-student knowledge distillation (DSKD) architecture. Unlike other S-T networks, we use two student networks and a single pre-trained teacher network, where the students have the same scale but inverted structures. This framework can enhance the distillation effect to improve consistency in the recognition of normal data while simultaneously introducing diversity in anomaly representation. To explore high-dimensional semantic information and capture anomaly clues, we employ two strategies. First, a pyramid matching mode is used to perform knowledge distillation on multi-scale feature maps in the intermediate layers of the networks. Second, interaction between the two student networks is facilitated through a deep feature embedding module, which is inspired by real-world group discussions. For classification, we obtain pixel-wise anomaly segmentation maps by measuring the discrepancy between the output feature maps of the teacher and student networks, from which an anomaly score is computed for sample-wise determination. We evaluate DSKD on three benchmark datasets and probe the effects of its internal modules through ablation experiments. The results demonstrate that DSKD can achieve exceptional performance on small models like ResNet18 and effectively improve vanilla S-T networks.
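
The scoring step described in the abstract (pixel-wise maps from teacher-student feature discrepancy, then a sample-wise score) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not state the distance measure, the upsampling method, or how the two students and the pyramid levels are combined, so the cosine distance, nearest-neighbour upsampling, summation, and max-pooling used here are all assumptions, and every function name is hypothetical.

```python
import numpy as np


def cosine_distance_map(f_t, f_s, eps=1e-8):
    """Per-pixel (1 - cosine similarity) between two (C, H, W) feature maps.

    Assumption: cosine distance is the discrepancy measure; the abstract
    only says the discrepancy between feature maps is measured.
    """
    dot = (f_t * f_s).sum(axis=0)
    norm = np.linalg.norm(f_t, axis=0) * np.linalg.norm(f_s, axis=0)
    return 1.0 - dot / (norm + eps)


def upsample_nearest(m, size):
    """Nearest-neighbour upsampling of an (H, W) map to (size, size)."""
    h, w = m.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return m[rows][:, cols]


def anomaly_map(teacher_feats, student_feats_list, size):
    """Sum the discrepancy maps over all pyramid levels and both students.

    teacher_feats: list of (C, H, W) arrays, one per pyramid level.
    student_feats_list: one such list per student (two for a dual-student setup).
    Assumption: levels and students are combined by plain summation.
    """
    amap = np.zeros((size, size))
    for student_feats in student_feats_list:
        for f_t, f_s in zip(teacher_feats, student_feats):
            amap += upsample_nearest(cosine_distance_map(f_t, f_s), size)
    return amap


def anomaly_score(amap):
    """Sample-wise score; taking the maximum of the map is an assumption."""
    return float(amap.max())
```

In a real pipeline the feature lists would come from intermediate layers of the pre-trained teacher and the two trained students; here they are plain arrays so the sketch stays self-contained.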

Autoencoder-Like Knowledge Distillation Network for Anomaly Detection

Anomaly detection based on multi-teacher knowledge distillation

Pull & Push: Leveraging Differential Knowledge Distillation for Efficient Unsupervised Anomaly Detection and Localization

A Weakly-Supervised Anomaly Detection Method Via Adversarial Training for Medical Images

Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection

Cosine similarity knowledge distillation for surface anomaly detection

Anomaly Detection via Reverse Distillation from One-Class Embedding

AEKD: Unsupervised auto-encoder knowledge distillation for industrial anomaly detection

Dual-student knowledge distillation for visual anomaly detection

Auto-AD: Autonomous Hyperspectral Anomaly Detection Network Based on Fully Convolutional Autoencoder

VDKD: A ViT-Based Student-Teacher Knowledge Distillation for Multi-Texture Class Anomaly Detection

Unsupervised anomaly detection and localization via bidirectional knowledge distillation

Advancing Pre-trained Teacher: Towards Robust Feature Discrepancy for Anomaly Detection

Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly Detection

Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection

Dual-Modeling Decouple Distillation for Unsupervised Anomaly Detection

Informative knowledge distillation for image anomaly segmentation

Unlocking the Potential of Reverse Distillation for Anomaly Detection

Improved AutoEncoder with LSTM module and KL divergence

Attention-based residual autoencoder for video anomaly detection

Distilling Aggregated Knowledge for Weakly-Supervised Video Anomaly Detection