Towards Deep Learning Models Resistant to Adversarial Attacks

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, Adrian Vladu

2019-09-05

Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples: inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at https://github.com/MadryLab/mnist_challenge and https://github.com/MadryLab/cifar10_challenge.
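
For orientation, the robust-optimization view mentioned in the abstract is commonly written as a saddle-point (min-max) problem. The abstract itself fixes no notation, so the symbols below are assumptions chosen for illustration: \theta for the model parameters, \mathcal{D} for the data distribution, \mathcal{S} for the set of allowed perturbations, and L for the training loss.

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
\left[ \max_{\delta \in \mathcal{S}} L(\theta,\, x+\delta,\, y) \right]
```

Read this way, the inner maximization corresponds to attacking the network (finding a high-loss perturbation of a given input), and the outer minimization corresponds to training the network so that even the worst-case perturbation in \mathcal{S} incurs low loss.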

Machine Learning, Neural and Evolutionary Computing

What problem does this paper attempt to address?

This paper addresses the vulnerability of deep neural networks to adversarial examples: inputs that are almost indistinguishable from natural data yet are misclassified by the network. The authors note that recent findings suggest the existence of adversarial attacks may be an inherent weakness of deep learning models. To tackle this, they study adversarial robustness through the lens of robust optimization. This view unifies much of the prior work on the topic and lets the authors identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, these methods specify a concrete security guarantee that would protect against any adversary. Using them, the authors train networks with significantly improved resistance to a wide range of adversarial attacks. They also propose security against a first-order adversary as a natural and broad guarantee, and they regard robustness against such well-defined classes of adversaries as an important stepping stone towards fully resistant deep learning models. In short, the core question is how to train deep neural networks that resist adversarial inputs, and the paper's answer is to treat both attack and defense as parts of a single robust optimization problem.
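
The attack-then-train idea summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it swaps the deep network for a logistic-regression model in NumPy so the gradients can be written by hand, uses an L-infinity perturbation set, and picks step sizes and budgets arbitrarily. All function names (`loss_and_grads`, `pgd_attack`, `adversarial_training_step`) are hypothetical.

```python
import numpy as np


def loss_and_grads(w, b, x, y):
    """Logistic loss for a label y in {0, 1}, with gradients w.r.t. x, w and b."""
    z = float(np.dot(w, x) + b)
    p = 1.0 / (1.0 + np.exp(-z))  # predicted probability of class 1
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1.0 - p + 1e-12))
    err = p - y  # derivative of the loss with respect to the logit z
    return loss, err * w, err * x, err


def pgd_attack(w, b, x, y, eps=0.3, step=0.05, n_steps=20, rng=None):
    """Approximate the inner maximization with projected gradient ascent.

    Assumes inputs live in [0, 1] and the perturbation set is an
    L-infinity ball of radius eps around x (illustrative choices).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    # Random start inside the eps-ball, clipped to the valid input range.
    x_adv = np.clip(x + rng.uniform(-eps, eps, size=x.shape), 0.0, 1.0)
    for _ in range(n_steps):
        _, grad_x, _, _ = loss_and_grads(w, b, x_adv, y)
        x_adv = x_adv + step * np.sign(grad_x)     # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back onto the ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # stay a valid input
    return x_adv


def adversarial_training_step(w, b, x, y, lr=0.1, **attack_kwargs):
    """One outer-minimization step: descend on the loss at the attacked input."""
    x_adv = pgd_attack(w, b, x, y, **attack_kwargs)
    _, _, grad_w, grad_b = loss_and_grads(w, b, x_adv, y)
    return w - lr * grad_w, b - lr * grad_b, x_adv


if __name__ == "__main__":
    w, b = np.array([1.0, -2.0, 0.5]), 0.0
    x, y = np.array([0.5, 0.5, 0.5]), 1
    clean_loss = loss_and_grads(w, b, x, y)[0]
    w_new, b_new, x_adv = adversarial_training_step(w, b, x, y)
    print("clean loss:", clean_loss)
    print("adversarial loss before update:", loss_and_grads(w, b, x_adv, y)[0])
    print("adversarial loss after update:", loss_and_grads(w_new, b_new, x_adv, y)[0])
```

The attack only queries the loss gradient with respect to the input, which is the sense in which it is a first-order adversary. For a real network the hand-written gradients would be replaced by automatic differentiation, and the loop would run over mini-batches rather than a single example.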

Attack As Defense: Characterizing Adversarial Examples Using Robustness.

Towards the first adversarially robust neural network model on MNIST

DeepDefense: Training Deep Neural Networks with Improved Robustness.

Robust Machine Learning Against Adversarial Samples at Test Time

Opportunities and Challenges in Deep Learning Adversarial Robustness: A Survey

Not So Robust After All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks

Optimism in the Face of Adversity: Understanding and Improving Deep Learning Through Adversarial Robustness

Secure Machine Learning Against Adversarial Samples at Test Time

A Framework for Robust Deep Learning Models Against Adversarial Attacks Based on a Protection Layer Approach

Adversarial robustness improvement for deep neural networks

MadNet: Using a MAD Optimization for Defending Against Adversarial Attacks

Deep Defense: Training DNNs with Improved Adversarial Robustness

Intriguing Properties of Adversarial Examples

Attacking Adversarial Attacks as A Defense

Towards Robustness against Unsuspicious Adversarial Examples

Multi-objective Search of Robust Neural Architectures Against Multiple Types of Adversarial Attacks

Towards Imperceptible and Robust Adversarial Example Attacks Against Neural Networks

Impact of Architectural Modifications on Deep Learning Adversarial Robustness

Evolving Robust Neural Architectures to Defend from Adversarial Attacks