Abstract: Deep neural networks can be easily fooled into making incorrect predictions when the input is corrupted by adversarial perturbations: human-imperceptible artificial noise. So far, adversarial training has been the most successful defense against such adversarial attacks. This work focuses on improving adversarial training to boost adversarial robustness. We first analyze, from an instance-wise perspective, how adversarial vulnerability evolves during adversarial training. We find that, during training, an overall reduction of adversarial loss is achieved at the cost of making a considerable proportion of training samples more vulnerable to adversarial attack, which results in an uneven distribution of adversarial vulnerability across the data. Such "uneven vulnerability" is prevalent across several popular robust training methods and, more importantly, relates to overfitting in adversarial training. Motivated by this observation, we propose a new adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training (ISEAT). It jointly smooths both the input and weight loss landscapes in an adaptive, instance-specific way, enhancing robustness more for those samples with higher adversarial vulnerability. Extensive experiments demonstrate the superiority of our method over existing defense methods. Notably, our method, when combined with the latest data augmentation and semi-supervised learning techniques, achieves state-of-the-art robustness against $\ell_{\infty}$-norm constrained attacks on CIFAR10 of 59.32% for Wide ResNet34-10 without extra data, and 61.55% for Wide ResNet28-10 with extra data. Code is available at https://github.com/TreeLLi/Instance-adaptive-Smoothness-Enhanced-AT.
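
To make the instance-wise view of adversarial vulnerability concrete, the following is a minimal sketch and not the ISEAT method itself. It assumes a toy linear classifier with logistic loss, for which the worst-case $\ell_{\infty}$ perturbation of radius `eps` has a closed form; the data, the weights, and the function name `adversarial_losses` are all hypothetical and chosen only for illustration.

```python
import math

def adversarial_losses(xs, ys, w, b, eps):
    """Per-instance logistic loss under the worst-case l_inf perturbation of radius eps.

    Assumption: a linear model f(x) = w.x + b with labels in {-1, +1}.
    For such a model the inner maximisation is exact: the margin
    y * f(x) drops by eps * ||w||_1 for every sample.
    """
    w_l1 = sum(abs(wi) for wi in w)
    losses = []
    for x, y in zip(xs, ys):
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        adv_margin = margin - eps * w_l1
        losses.append(math.log1p(math.exp(-adv_margin)))
    return losses

# Hypothetical toy data: two samples far from the boundary, two close to it.
xs = [[2.0, 1.0], [0.3, 0.1], [-1.5, -0.5], [-0.2, 0.1]]
ys = [1, 1, -1, -1]
w, b = [1.0, 1.0], 0.0

clean = adversarial_losses(xs, ys, w, b, eps=0.0)
adv = adversarial_losses(xs, ys, w, b, eps=0.1)
mean_adv = sum(adv) / len(adv)

for i, (c, a) in enumerate(zip(clean, adv)):
    print(f"sample {i}: clean loss {c:.3f}, adversarial loss {a:.3f}")
print(f"mean adversarial loss {mean_adv:.3f}")
```

Even in this toy setting the mean adversarial loss hides a wide spread: samples near the decision boundary carry most of the loss. An instance-adaptive scheme in the spirit described above would direct more regularization toward those samples, although how ISEAT does so is not specified in the abstract.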

Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing