Abstract: Catastrophic overfitting (CO) in single-step adversarial training (AT) results in abrupt drops in the adversarial test accuracy (even down to 0%). For models trained with multi-step AT, it has been observed that the loss function behaves locally linearly with respect to the input; this property is, however, lost in single-step AT. To address CO in single-step AT, several methods have been proposed to enforce local linearity of the loss via regularization. However, these regularization terms considerably slow down training due to Double Backpropagation. Instead, in this work, we introduce a regularization term, called ELLE, to mitigate CO effectively and efficiently in classical AT evaluations, as well as in some more difficult regimes, e.g., large adversarial perturbations and long training schedules. Our regularization term can be theoretically linked to the curvature of the loss function and is computationally cheaper than previous methods because it avoids Double Backpropagation. Our thorough experimental validation demonstrates that our method does not suffer from CO, even in challenging settings where previous works suffer from it. We also notice that adapting our regularization parameter during training (ELLE-A) greatly improves performance, especially in large $\epsilon$ setups. Our implementation is available in


### What problem does this paper attempt to solve?

This paper aims to solve the problem of catastrophic overfitting (CO) in single-step adversarial training (AT). Specifically:

1. **Background and challenges**:
   - In multi-step adversarial training, the loss function behaves locally linearly with respect to the input, but this property is lost in single-step adversarial training.
   - Single-step adversarial training (such as FGSM) improves training efficiency but is prone to CO, which manifests as an abrupt drop in adversarial test accuracy (even down to 0%) while the adversarial accuracy against the single-step training attack rises to 100%.
2. **Limitations of existing methods**:
   - Existing methods enforce local linearity of the loss function through regularization, but these regularization terms usually involve Double Backpropagation, which significantly slows down training.
   - These methods still cannot reliably avoid CO under large perturbation sizes (large $\epsilon$) or long training schedules.
3. **The method proposed in the paper**:
   - The paper introduces a new regularization term, ELLE (Efficient Local Linearity Enforcement), to overcome CO.
   - ELLE enforces local linearity of the loss function without Double Backpropagation, which makes it computationally cheaper than previous regularization methods.
   - An adaptive version, ELLE-A, is also proposed. It adjusts the regularization parameter during training according to the local linearity error, further improving performance, especially for large $\epsilon$.
4. **Theoretical and experimental verification**:
   - Theoretically, ELLE is linked to the curvature of the loss function, which allows it to detect the onset of CO and avoid it.
   - Experimental results show that ELLE and ELLE-A perform well on standard benchmark datasets (such as CIFAR10, CIFAR100, SVHN and ImageNet) and also avoid CO in challenging settings where previous methods fail.
In conclusion, by proposing ELLE and its adaptive version ELLE-A, this paper provides an efficient and effective solution to the problem of catastrophic overfitting in single-step adversarial training.
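The summary above does not state the formula of the regularizer, so the following is only an illustrative sketch of how local linearity can be measured with forward evaluations alone. It rests on an assumption: local linearity is scored by how far the loss at a random convex combination of two points in the $\epsilon$-ball deviates from the same convex combination of their losses. The names `linearity_error`, `linear_loss` and `curved_loss` are hypothetical, and the toy losses stand in for a network's loss as a function of its input.

```python
import random


def linearity_error(loss_fn, x, eps, rng):
    """Squared interpolation gap of loss_fn inside the l_inf ball around x.

    Assumed measure (not taken from the source): uses only three loss
    evaluations and no gradients, hence no Double Backpropagation.
    """
    # Two random points inside the eps-ball around x.
    x_a = [xi + rng.uniform(-eps, eps) for xi in x]
    x_b = [xi + rng.uniform(-eps, eps) for xi in x]
    # A random convex combination of the two points.
    alpha = rng.random()
    x_c = [(1 - alpha) * a + alpha * b for a, b in zip(x_a, x_b)]
    # For a loss that is linear on the ball, the loss at x_c equals the
    # same convex combination of the losses at x_a and x_b.
    gap = loss_fn(x_c) - (1 - alpha) * loss_fn(x_a) - alpha * loss_fn(x_b)
    return gap ** 2


def linear_loss(x):
    # Toy loss that is exactly linear in the input.
    return 2.0 * x[0] - 3.0 * x[1] + 1.0


def curved_loss(x):
    # Toy loss with non-zero curvature.
    return x[0] ** 2 + x[1] ** 2


rng = random.Random(0)
x = [0.5, -0.25]
err_linear = max(linearity_error(linear_loss, x, 0.1, rng) for _ in range(100))
err_curved = max(linearity_error(curved_loss, x, 0.1, rng) for _ in range(100))
```

In this sketch the error vanishes (up to floating-point rounding) for the linear loss and is strictly positive for the curved one, which is the kind of signal a penalty term could use. In a training loop such a quantity would be averaged over a batch and added to the loss with a weighting parameter; how ELLE-A adapts that parameter is not specified here.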

Efficient local linearity regularization to overcome catastrophic overfitting
