Abstract: Neural networks are widely used in computer vision tasks. However, adversarial examples can cause a neural network to produce false predictions. Adversarial training has been shown to be an effective defense against adversarial examples. Nevertheless, it requires high computing power and long training time, which limits its application scenarios. An adversarial example defense method based on knowledge distillation is proposed, which reuses the defense experience learned on large datasets for new classification tasks. During distillation, the teacher model has the same structure as the student model, feature map vectors are used to transfer experience, and clean samples are used for training. Multi-dimensional feature maps are utilized to enhance the semantic information. Furthermore, an attention mechanism based on feature maps is proposed, which boosts the effect of distillation by assigning weights to features according to their importance. Experiments were conducted on the open-source CIFAR-100 and CIFAR-10 datasets, and various white-box attack algorithms such as FGSM (fast gradient sign method), PGD (projected gradient descent), and C&W (Carlini-Wagner attack) were applied to test the results. The accuracy of the proposed method on CIFAR-10 clean samples exceeds that of adversarial training and is close to that of a model trained on clean samples. Under the PGD attack with L2 distance, the effectiveness of the proposed method is close to that of adversarial training and significantly higher than that of normal training. Moreover, the proposed method is a lightweight adversarial defense method with low learning cost. Its computing power requirement is far lower than that of adversarial training, even when optimization schemes such as the attention mechanism and multi-dimensional feature maps are added. As a neural network learning scheme, knowledge distillation can learn the decision-making experience of normal samples and extract robust features. It uses a small amount of data to generate accurate and robust models, improves generalization, and reduces the cost of adversarial training.
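
The attention-weighted feature map distillation described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact loss, so everything here is an assumption made for illustration: "multi-dimensional" is read as feature maps taken from several layers, importance is scored by the teacher's mean absolute activation per channel, and the function names `channel_weights` and `feature_distill_loss` are hypothetical rather than the authors' own.

```python
import numpy as np


def channel_weights(fmap):
    """Softmax over per-channel mean absolute activation.

    Assumption: importance is scored from the teacher's activations;
    the paper's actual attention formula is not given in the abstract.
    fmap has shape (C, H, W); the result has shape (C,) and sums to 1.
    """
    score = np.mean(np.abs(fmap), axis=(1, 2))
    score = score - score.max()  # shift for numerical stability
    w = np.exp(score)
    return w / w.sum()


def feature_distill_loss(teacher_fmaps, student_fmaps):
    """Attention-weighted squared error, summed over several layers.

    Both arguments are lists of (C, H, W) arrays, one per layer, with
    matching shapes (teacher and student share the same structure).
    """
    total = 0.0
    for t, s in zip(teacher_fmaps, student_fmaps):
        w = channel_weights(t)  # one weight per channel
        per_channel = np.mean((t - s) ** 2, axis=(1, 2))  # MSE per channel
        total += float(np.sum(w * per_channel))
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for feature maps from two layers on a clean sample.
    teacher = [rng.normal(size=(4, 8, 8)), rng.normal(size=(8, 4, 4))]
    student = [t + rng.normal(scale=0.1, size=t.shape) for t in teacher]
    print("distillation loss:", feature_distill_loss(teacher, student))
```

In a training loop, a loss of this kind would be minimized with respect to the student's parameters on clean samples only, which is consistent with the abstract's claim that the method avoids the cost of generating adversarial examples during training.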

Adversarial Sparse Teacher: Defense Against Distillation-Based Model Stealing Attacks Using Adversarial Examples

Private Knowledge Transfer via Model Distillation with Generative Adversarial Networks

Distilling Adversarial Robustness Using Heterogeneous Teachers

Distilling the Undistillable: Learning from a Nasty Teacher

Splitting the Difference on Adversarial Training

Anti-Distillation Backdoor Attacks: Backdoors Can Really Survive in Knowledge Distillation

Adversarial Training via Adaptive Knowledge Amalgamation of an Ensemble of Teachers

Adversarial Distillation Based on Slack Matching and Attribution Region Alignment

Enhanced Accuracy and Robustness via Multi-teacher Adversarial Distillation

Mitigating Accuracy-Robustness Trade-off via Balanced Multi-Teacher Adversarial Distillation

Mutual Adversarial Training: Learning together is better than going alone

Improving Defensive Distillation using Teacher Assistant

Adversarial Distillation for Learning with Privileged Provisions

Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge

LTD: Low Temperature Distillation for Robust Adversarial Training

Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples

Adversarial Examples Defense Method Based on Multi-Dimensional Feature Maps Knowledge Distillation

Out of Thin Air: Exploring Data-Free Adversarial Robustness Distillation

Boosting Adversarial Robustness Via Self-Paced Adversarial Training

Adversarial defense based on distribution transfer

Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing