Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality

Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora
DOI: https://doi.org/10.48550/arXiv.2002.06668
2020-02-24
Abstract: Adversarial training is a popular method to give neural nets robustness against adversarial perturbations. In practice adversarial training leads to low robust training loss. However, a rigorous explanation for why this happens under natural conditions is still missing. Recently a convergence theory for standard (non-adversarial) supervised training was developed by various groups for \emph{very overparametrized} nets. It is unclear how to extend these results to adversarial training because of the min-max objective. Recently, a first step towards this direction was made by Gao et al. using tools from online learning, but they require the width of the net to be \emph{exponential} in input dimension $d$, and with an unnatural activation function. Our work proves convergence to low robust training loss for \emph{polynomial} width instead of exponential, under natural assumptions and with the ReLU activation. A key element of our proof is showing that ReLU networks near initialization can approximate the step function, which may be of independent interest.
Machine Learning
### What problem does this paper attempt to solve?

This paper seeks a theoretical explanation for why adversarial training of neural networks converges to low robust training loss. Specifically, it focuses on the following points:

1. **Effectiveness of adversarial training**: Adversarial training is a method for making neural networks robust to small perturbations of the input. In practice it reliably reduces the robust training loss, but the underlying theoretical mechanism is still unclear. The paper attempts to explain why adversarial training achieves this under natural conditions.
2. **Overcoming the curse of dimensionality**: The earlier analysis of Gao et al. (2019) requires the network width and running time to grow exponentially with the input dimension $d$, a form of the so-called "curse of dimensionality". The goal of this paper is to prove that adversarial training converges to low robust training loss with polynomial width and in polynomial time, thus overcoming this problem.
3. **Use of the ReLU activation function**: The earlier analysis relied on an activation function that is not used in practice, while this paper works with the commonly used ReLU activation, which makes its results more relevant to practice.

### Main contributions

- **Polynomial width and time complexity**: The paper proves that a two-layer ReLU neural network of polynomial width can reach low robust training loss through adversarial training in polynomial time.
- **Existence of a robust network near initialization**: The paper shows that close to the Gaussian random initialization there exists a two-layer ReLU network of polynomial width that achieves low robust training loss.
- **Approximation of step functions at polynomial width**: The paper gives a new approximation-theory result: a two-layer ReLU network of polynomial width, near initialization, can approximate step functions. This result may have further applications in the theory of over-parameterized networks.

### Research background

- **Adversarial examples and defense methods**: Since Szegedy et al. (2013) discovered adversarial examples, many defense methods have been proposed to make neural networks more robust to perturbations, including adversarial training, certification methods, and input transformations.
- **Convergence of over-parameterized networks**: In recent years, significant progress has been made on the convergence theory of over-parameterized neural networks under standard (non-adversarial) training. These results explain how gradient descent converges to a small training loss in polynomial time, provided that the network is sufficiently over-parameterized.
- **Convergence analysis of adversarial training**: Although adversarial training performs well in practice, its convergence analysis has remained an open problem. Gao et al. (2019) made the first attempt to extend the convergence results for standard training to adversarial training, but they required exponential width and running time, and the activation function they used is not common in practice.

Together, these contributions give a theoretical basis for understanding why adversarial training is effective and suggest directions for further research.
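To make the step-function contribution more concrete, the sketch below shows the textbook way two ReLU units can approximate the step function $\mathbb{1}[x > 0]$ in one dimension. It is only an illustration of the idea under simplifying assumptions: the weights are hand-picked rather than close to a Gaussian random initialization, so this is not the paper's construction, and the names `soft_step` and the sharpness parameter `k` are hypothetical.

```python
import numpy as np


def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)


def step(x):
    """Exact step function 1[x > 0]."""
    return (np.asarray(x, dtype=float) > 0).astype(float)


def soft_step(x, k):
    """Approximate 1[x > 0] with two ReLU units (hand-picked weights).

    The output is 0 for x <= 0, rises linearly on (0, 1/k),
    and is 1 for x >= 1/k, so a larger k gives a sharper step.
    """
    x = np.asarray(x, dtype=float)
    return relu(k * x) - relu(k * x - 1.0)


if __name__ == "__main__":
    xs = np.linspace(-1.0, 1.0, 2001)
    for k in (10, 100, 1000):
        gap = np.abs(soft_step(xs, k) - step(xs))
        # Fraction of grid points where the approximation is off.
        print(k, float(np.mean(gap > 1e-9)))
```

The approximation differs from the true step only on the interval $(0, 1/k)$, which shrinks as `k` grows. The difficulty addressed in the paper is different: obtaining such an approximation with a polynomial number of neurons whose weights stay near their random initial values.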