11 Adversarial Perturbations of Deep Neural Networks

David Warde-Farley, Ian Goodfellow
2016-12-23
Abstract: The past several years have given rise to two related lines of inquiry in deep learning research that view the training of neural networks through the lens of an adversarial game. The first body of work centers on the surprising result that discriminative classifiers are often highly sensitive to very small perturbations in the input space. This finding has led to algorithms designed to increase classifier robustness, both to these perturbations and more generally, by exploiting these “adversarial examples”. The second body of work frames generative model training as an adversarial game, pitting a sample generation process against a classifier trained to discriminate synthesized examples from training data. This chapter describes how to construct adversarial perturbations in Section 11.2, then describes how to use the resulting adversarial examples to improve the robustness of a classifier in Section 11.3. Finally, Section 11.4 …