Abstract: By injecting adversarial examples into the training data, adversarial training is a promising way to improve the robustness of deep learning models. However, most existing adversarial training approaches are based on a specific type of adversarial attack. Such an attack may not provide sufficiently representative samples from the adversarial domain, leading to weak generalization on adversarial examples from other attacks. Moreover, during adversarial training, adversarial perturbations on inputs are usually crafted by fast single-step adversaries so as to scale to large datasets. This work mainly focuses on adversarial training with the efficient single-step FGSM adversary. In this scenario, it is difficult to train a model that generalizes well because representative adversarial samples are lacking; that is, the samples do not accurately reflect the adversarial domain. To alleviate this problem, we propose a novel Adversarial Training with Domain Adaptation (ATDA) method. Our intuition is to regard adversarial training on the FGSM adversary as a domain adaptation task with a limited number of target domain samples. The main idea is to learn a representation that is semantically meaningful and domain invariant on the clean domain as well as the adversarial domain. Empirical evaluations on Fashion-MNIST, SVHN, CIFAR-10 and CIFAR-100 demonstrate that ATDA can greatly improve the generalization of adversarial training and the smoothness of the learned models, and outperforms state-of-the-art methods on standard benchmark datasets. To show the transferability of our method, we also extend ATDA to adversarial training on iterative attacks such as PGD-Adversarial Training (PAT), and the defense performance is improved considerably.
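For readers unfamiliar with the single-step FGSM adversary mentioned in the abstract, the following is a minimal sketch of how such a perturbation is crafted. It is not the ATDA method itself. It assumes a toy logistic-regression model so that the input gradient can be written in closed form, and the weights `w`, bias `b`, example `x`, label `y`, and step size `eps` are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, b, x, y):
    """Binary cross-entropy of a logistic-regression model on one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x, which is (p - y) * w."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """Single-step FGSM: shift each coordinate by eps along the sign of the loss gradient."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]

# Hypothetical toy model and example (assumptions, not taken from the paper).
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, -0.3], 1
x_adv = fgsm(w, b, x, y, eps=0.1)

print("clean loss:      ", loss(w, b, x, y))
print("adversarial loss:", loss(w, b, x_adv, y))
```

In adversarial training, examples such as `x_adv` are added to the training batches alongside the clean inputs. Because FGSM needs only one gradient computation per example, it scales to large datasets, at the cost of covering only a narrow slice of the adversarial domain.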

Improving Global Adversarial Robustness Generalization With Adversarially Trained GAN

Creative and Diverse Artwork Generation Using Adversarial Networks

Generative Adversarial Trainer: Defense to Adversarial Perturbations with GAN

Rob-GAN: Generator, Discriminator, and Adversarial Attacker

Feature Augmentation for Adversarial Robustness

Evaluation of GAN-Based Model for Adversarial Training

AT-GAN: An Adversarial Generator Model for Non-constrained Adversarial Examples

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

Generating Adversarial Examples with Adversarial Networks

A Direct Approach to Robust Deep Learning Using Adversarial Networks

APE-GAN: Adversarial Perturbation Elimination with GAN

GRIP-GAN: an Attack-Free Defense Through General Robust Inverse Perturbation

Improving GAN Training via Feature Space Shrinkage

Cycle-Consistent Adversarial GAN: the integration of adversarial attack and defense

GanDef: A GAN based Adversarial Training Defense for Neural Network Classifier

GIU-GANs: Global Information Utilization for Generative Adversarial Networks

Improving the Transferability of Adversarial Examples by Using Generative Adversarial Networks and Data Enhancement

HAD-GAN: A Human-perception Auxiliary Defense GAN to Defend Adversarial Examples

Applying adversarial networks to increase the data efficiency and reliability of Self-Driving Cars

Improving the Speed and Quality of GAN by Adversarial Training

Improving the Generalization of Adversarial Training with Domain Adaptation