Dynamic Label Adversarial Training for Deep Learning Robustness Against Adversarial Attacks

Zhenyu Liu, Haoran Duan, Huizhi Liang, Yang Long, Vaclav Snasel, Guiseppe Nicosia, Rajiv Ranjan, Varun Ojha

2024-08-23

Abstract: Adversarial training is one of the most effective methods for enhancing model robustness. Recent approaches incorporate adversarial distillation in adversarial training architectures. However, we notice two scenarios of defense methods that limit their performance: (1) Previous methods primarily use static ground truth for adversarial training, but this often causes robust overfitting; (2) The loss functions are either Mean Squared Error or KL-divergence leading to a sub-optimal performance on clean accuracy. To solve those problems, we propose a dynamic label adversarial training (DYNAT) algorithm that enables the target model to gradually and dynamically gain robustness from the guide model's decisions. Additionally, we found that a budgeted dimension of inner optimization for the target model may contribute to the trade-off between clean accuracy and robust accuracy. Therefore, we propose a novel inner optimization method to be incorporated into the adversarial training. This will enable the target model to adaptively search for adversarial examples based on dynamic labels from the guiding model, contributing to the robustness of the target model. Extensive experiments validate the superior performance of our approach.
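For orientation, the "static ground truth" setting mentioned in the abstract can be written as the standard min-max adversarial training objective. The second formula is only an illustrative assumption of what replacing the static label with a guide model's output could look like; the abstract does not give the DYNAT loss, and the symbols $f_\theta$ (target model), $g_\phi$ (guide model), $\mathcal{L}$ (loss), and $\epsilon$ (perturbation budget under an assumed $\ell_p$ norm) are notation introduced here.

$$
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}} \Big[ \max_{\|\delta\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, y\big) \Big]
\qquad \text{(static label } y\text{)}
$$

$$
\min_{\theta} \; \mathbb{E}_{x\sim\mathcal{D}} \Big[ \max_{\|\delta\|_p \le \epsilon} \mathcal{L}\big(f_\theta(x+\delta),\, g_\phi(x)\big) \Big]
\qquad \text{(assumed dynamic label } g_\phi(x)\text{)}
$$

In the second form the training target changes whenever the guide model $g_\phi$ changes, which is one plausible reading of "dynamic label".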

Machine Learning, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

### The Problem Addressed by the Paper

This paper addresses the robustness of deep learning models against adversarial attacks. The authors identify two limitations of existing adversarial training methods:

1. **Static Label Problem**: Existing adversarial training methods mainly use static ground truth labels, which often leads to robust overfitting.
2. **Loss Function Problem**: Existing methods use either Mean Squared Error (MSE) or KL divergence as the loss function, which leads to sub-optimal accuracy on clean samples.

To overcome these issues, the authors propose a Dynamic Label Adversarial Training (DYNAT) algorithm, in which the target model gradually and dynamically gains robustness from the decisions of a guide model. They also observe that a budgeted dimension of inner optimization for the target model may contribute to the trade-off between clean accuracy and robust accuracy, and they propose a new inner optimization method in which the target model adaptively searches for adversarial examples based on the guide model's dynamic labels. According to the authors, these changes improve both the model's robustness under adversarial attacks and its classification performance on clean samples.
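The idea of searching for adversarial examples against a guide model's label, rather than a fixed ground-truth label, can be illustrated with a small sketch. This is not the authors' implementation: the toy linear classifiers, the cross-entropy loss, the sign-gradient search under an L-infinity budget, and every name (`search_adversarial`, `target_w`, `guide_w`, `eps`, `step`) are assumptions made for illustration only.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Toy linear classifier: one weight row per class, returns class probabilities."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def cross_entropy(label, probs):
    """Cross-entropy of `probs` against a label distribution (one-hot or soft)."""
    return -sum(t * math.log(p) for t, p in zip(label, probs))

def input_gradient(weights, x, label):
    """Gradient of cross_entropy(label, predict(weights, x)) with respect to x.

    For softmax plus cross-entropy with a label that sums to 1, the gradient
    with respect to logit k is (p_k - label_k); the chain rule through the
    linear layer then gives the gradient with respect to the input.
    """
    probs = predict(weights, x)
    return [
        sum((probs[k] - label[k]) * weights[k][i] for k in range(len(weights)))
        for i in range(len(x))
    ]

def search_adversarial(target_w, guide_w, x, eps=0.3, step=0.1, iters=10):
    """Hypothetical inner search: perturb x to raise the target's loss
    against the guide's soft prediction, staying within an L-infinity ball."""
    # Assumption: the label comes from the guide on the clean input. It is
    # "dynamic" because it changes whenever the guide model changes.
    dynamic_label = predict(guide_w, x)
    x_adv = list(x)
    for _ in range(iters):
        grad = input_gradient(target_w, x_adv, dynamic_label)
        # Sign-gradient ascent step on the target model's loss.
        x_adv = [xa + step * ((g > 0) - (g < 0)) for xa, g in zip(x_adv, grad)]
        # Project back into the eps-ball around the clean input.
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv, dynamic_label

# Illustrative weights and input (made-up values).
target_w = [[1.0, -0.5], [-0.5, 1.0]]
guide_w = [[2.0, 0.0], [0.0, 2.0]]
x = [1.0, 0.2]

x_adv, dynamic_label = search_adversarial(target_w, guide_w, x)
clean_loss = cross_entropy(dynamic_label, predict(target_w, x))
adv_loss = cross_entropy(dynamic_label, predict(target_w, x_adv))
print(clean_loss, adv_loss)
```

In a full training loop, the target model would then be updated to reduce the loss on `x_adv` against `dynamic_label`. How the guide model is trained, and which loss DYNAT actually uses, is not specified in the text above.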

Attack As Defense: Characterizing Adversarial Examples Using Robustness

Adversarial Distributional Training for Robust Deep Learning

Soften to Defend: Towards Adversarial Robustness via Self-Guided Label Refinement

Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better

Enhancing Adversarial Robustness in Low-Label Regime via Adaptively Weighted Regularization and Knowledge Distillation

Deep Defense: Training DNNs with Improved Adversarial Robustness

DeepDefense: Training Deep Neural Networks with Improved Robustness

Enhancing adversarial robustness for deep metric learning via neural discrete adversarial training

Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training

Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing

Dual Head Adversarial Training

Annealing Self-Distillation Rectification Improves Adversarial Training

Distributed Adversarial Training to Robustify Deep Neural Networks at Scale

Layer-wise Adversarial Defense: an ODE Perspective

LTD: Low Temperature Distillation for Robust Adversarial Training

Adversarial Training with Bi-directional Likelihood Regularization for Visual Classification

Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning

Better Safe Than Sorry: Preventing Delusive Adversaries with Adversarial Training

Effective and Robust Adversarial Training against Data and Label Corruptions