Abstract: Adversarial training suffers from poor effectiveness because optimising a loss with hard labels is challenging. Adversarial distillation has emerged as a potential solution, encouraging target models to mimic the output of teacher models. However, reliance on pre‐trained teachers incurs additional training costs and raises concerns about the reliability of their knowledge. Furthermore, existing methods fail to consider the significant differences in unconfident samples between the early and late stages of training, potentially resulting in robust overfitting. To address these issues, an adversarial defence method named Clean, Performance‐robust, and Performance‐sensitive Historical Information based Adversarial Self‐Distillation (CPr&PsHI‐ASD) is presented. Firstly, an adversarial self‐distillation replacement method based on clean, performance‐robust, and performance‐sensitive historical information is developed to eliminate pre‐training costs and enhance the reliability of the guidance given to the target model. Secondly, adversarial self‐distillation algorithms that leverage knowledge distilled from the previous iteration are introduced to facilitate the self‐distillation of adversarial knowledge and mitigate robust overfitting. The method allows the target model to distill the most recent robust and non‐robust knowledge from the previous iteration. To avoid storing model parameters for generating adversarial examples (AEs), an existing self‐distillation algorithm is extended so that each "distilling‐batch" participates in multiple consecutive iterations. Experiments are conducted on the CIFAR‐10, CIFAR‐100, and Tiny‐ImageNet datasets. The results demonstrate that CPr&PsHI‐ASD is more effective than existing adversarial distillation methods at enhancing adversarial robustness and mitigating robust overfitting against various adversarial attacks, and that it does so without teachers.
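The general idea of distilling from the previous iteration, rather than from a pre‐trained teacher, can be illustrated with a minimal sketch. This is not the CPr&PsHI‐ASD algorithm: it is a generic self‐distillation loss that mixes the usual hard‐label cross‐entropy with a KL term towards soft labels saved from an earlier iteration. The names `self_distillation_loss`, `prev_probs`, and `alpha` are hypothetical, and generating the adversarial examples is left out.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def self_distillation_loss(logits, labels, prev_probs, alpha=0.5, eps=1e-12):
    """Generic self-distillation loss (illustrative assumption, not the paper's loss).

    logits:     current model outputs on a batch, shape (n, classes)
    labels:     integer hard labels, shape (n,)
    prev_probs: soft labels the same model produced for this batch in an
                earlier iteration, shape (n, classes)
    alpha:      hypothetical weight between hard-label and distillation terms
    """
    probs = softmax(logits)
    n = logits.shape[0]
    # Hard-label cross-entropy.
    ce = -np.log(probs[np.arange(n), labels] + eps).mean()
    # KL(prev_probs || probs): pulls current predictions towards the earlier ones.
    kl = (prev_probs * (np.log(prev_probs + eps) - np.log(probs + eps))).sum(axis=1).mean()
    return (1.0 - alpha) * ce + alpha * kl
```

Because the soft labels come from the model's own earlier outputs, only the predictions for the batch need to be kept, not a separate teacher network.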

Clean, performance‐robust, and performance‐sensitive historical information based adversarial self‐distillation