Abstract: Adversarial training is a popular method to give neural nets robustness against adversarial perturbations. In practice, adversarial training leads to low robust training loss. However, a rigorous explanation for why this happens under natural conditions is still missing. Recently a convergence theory for standard (non-adversarial) supervised training was developed by various groups for *very overparametrized* nets. It is unclear how to extend these results to adversarial training because of the min-max objective. A first step in this direction was made by Gao et al. using tools from online learning, but they require the width of the net to be *exponential* in the input dimension $d$ and use an unnatural activation function. Our work proves convergence to low robust training loss for *polynomial* width instead of exponential, under natural assumptions and with the ReLU activation. A key element of our proof is showing that ReLU networks near initialization can approximate the step function, which may be of independent interest.
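
For orientation, the min-max objective mentioned in the abstract is commonly written in the form below. The notation is assumed here for illustration and may differ from the paper's: $f_W$ is the network with weights $W$, $\ell$ is a loss function, $(x_i, y_i)$ are the $n$ training examples, and $\rho$ is the perturbation radius in some chosen norm.

$$
\min_{W} \; \frac{1}{n} \sum_{i=1}^{n} \; \max_{\|\delta_i\| \le \rho} \; \ell\bigl(f_W(x_i + \delta_i),\, y_i\bigr)
$$

The inner maximization is the adversary picking a worst-case perturbation for each example, and the outer minimization is training. The value of this objective is the robust training loss.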

What problem does this paper attempt to address?

This paper aims to explain theoretically why adversarial training of neural networks converges to low robust training loss. Specifically, it focuses on the following points:

1. **Effectiveness of adversarial training**: Adversarial training is a method for making neural networks robust to small perturbations of the input. Although adversarial training reduces the robust training loss in practice, the underlying theoretical mechanism is still unclear. The paper attempts to explain why adversarial training achieves this under natural conditions.
2. **Overcoming the curse of dimensionality**: Earlier work (Gao et al., 2019) requires the width of the network and the running time to grow exponentially with the input dimension, the so-called "curse of dimensionality". The goal of this paper is to prove that adversarial training converges to low robust training loss with polynomial width and in polynomial time.
3. **Use of the ReLU activation function**: Previous theoretical work relies on an activation function that is not used in practice, while this paper analyzes the commonly used ReLU activation, which brings the results closer to practice.

### Main contributions

- **Polynomial width and time complexity**: The paper proves that a two-layer ReLU neural network of polynomial width reaches low robust training loss through adversarial training in polynomial time.
- **Existence of a good network near initialization**: The paper shows that near the Gaussian random initialization there exists a two-layer ReLU network of polynomial width that achieves low robust training loss.
- **Polynomial-width approximation of step functions**: The paper gives a new approximation-theory result: a two-layer ReLU network of polynomial width near initialization can approximate step functions. This result may have further applications in the theory of over-parameterized networks.

### Research background

- **Adversarial examples and defense methods**: Since Szegedy et al. (2013) discovered adversarial examples, many defense methods have been proposed to make neural networks robust to perturbations, including adversarial training, certification methods, and input transformations.
- **Convergence of over-parameterized networks**: In recent years, significant progress has been made on the convergence theory of over-parameterized neural networks under standard (non-adversarial) training. These results show that gradient descent converges to small training loss in polynomial time, provided that the network is sufficiently over-parameterized.
- **Convergence analysis of adversarial training**: Although adversarial training performs well in practice, its convergence analysis is still an open problem. Gao et al. (2019) made the first attempt to extend the convergence results of standard training to adversarial training, but they required exponential width and running time, and the activation function they used is not common in practice.

Through these contributions, the paper provides a theoretical basis for understanding the effectiveness of adversarial training and points out directions for future research.
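
To make the setting above concrete, here is a minimal NumPy sketch of adversarial training for a two-layer ReLU network with Gaussian random initialization. It is an illustration under stated assumptions, not the paper's algorithm or proof setting: the squared loss, the ℓ2 perturbation ball of radius `rho`, the projected-gradient inner attack, the fixed output layer, and all function names and hyperparameters are choices made here for the example.

```python
import numpy as np

def init_net(d, m, rng):
    # Gaussian hidden weights; output weights fixed at +-1/sqrt(m) (assumed setup).
    W = rng.standard_normal((m, d))
    a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)
    return W, a

def loss_and_grads(W, a, X, y):
    """Squared loss 0.5 * mean((f(x) - y)^2) and its gradients w.r.t. W and X."""
    pre = X @ W.T                          # (n, m) pre-activations
    act = (pre > 0).astype(float)          # ReLU derivative
    out = np.maximum(pre, 0.0) @ a         # (n,) network outputs
    resid = out - y
    loss = 0.5 * np.mean(resid ** 2)
    g_pre = resid[:, None] * a[None, :] * act / X.shape[0]
    return loss, g_pre.T @ X, g_pre @ W    # loss, dL/dW (m, d), dL/dX (n, d)

def pgd_attack(W, a, X, y, rho, steps=10):
    """Approximate the inner max with projected gradient ascent on an l2 ball."""
    step_size = 2.5 * rho / steps          # heuristic step size (assumption)
    delta = np.zeros_like(X)
    for _ in range(steps):
        _, _, g_x = loss_and_grads(W, a, X + delta, y)
        g_norm = np.linalg.norm(g_x, axis=1, keepdims=True)
        delta = delta + step_size * g_x / np.maximum(g_norm, 1e-12)
        # Project each perturbation back onto the ball of radius rho.
        d_norm = np.linalg.norm(delta, axis=1, keepdims=True)
        delta = delta * np.minimum(1.0, rho / np.maximum(d_norm, 1e-12))
    return X + delta

def adversarial_train(X, y, m=256, rho=0.1, lr=0.5, epochs=100, seed=0):
    """Alternate an attack on the current net with a gradient step on W."""
    rng = np.random.default_rng(seed)
    W, a = init_net(X.shape[1], m, rng)
    history = []
    for _ in range(epochs):
        X_adv = pgd_attack(W, a, X, y, rho)
        loss, g_W, _ = loss_and_grads(W, a, X_adv, y)
        history.append(loss)               # robust loss estimate before the step
        W -= lr * g_W                      # only the hidden layer is trained
    return W, a, history

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((16, 4))
    X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
    y = np.sign(X[:, 0])                            # toy labels
    _, _, history = adversarial_train(X, y)
    print("first / last robust loss estimate:", history[0], history[-1])
```

The attack only lower-bounds the true inner maximum, so the recorded values are estimates of the robust training loss rather than exact values. The width `m` here is a free parameter; the paper's result concerns how large it must be, which this sketch does not attempt to reproduce.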

Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

Convergence of Adversarial Training in Overparametrized Neural Networks

Convergence of Adversarial Training in Overparametrized Networks

The curse of overparametrization in adversarial training: Precise analysis of robust generalization for random features regression

Can overfitted deep neural networks in adversarial training generalize? -- An approximation viewpoint

A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks

Over-parametrized neural networks as under-determined linear systems

Over-parameterization and Adversarial Robustness in Neural Networks: An Overview and Empirical Analysis

Stability Analysis and Generalization Bounds of Adversarial Training

The Surprising Harmfulness of Benign Overfitting for Adversarial Robustness

High-dimensional (Group) Adversarial Training in Linear Regression

Overparameterized Linear Regression under Adversarial Attacks

Overfitting in adversarially robust deep learning

Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training

Why Adversarial Training of ReLU Networks Is Difficult?

Low-dimensional Intrinsic Dimension Reveals a Phase Transition in Gradient-Based Learning of Deep Neural Networks

Convergence Analysis for Over-Parameterized Deep Learning

Regularization for Adversarial Robust Learning

Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training

An Improved Analysis of Training Over-parameterized Deep Neural Networks