Efficient local linearity regularization to overcome catastrophic overfitting

Elias Abad Rocamora, Fanghui Liu, Grigorios G. Chrysos, Pablo M. Olmos, Volkan Cevher
2024-02-29
Abstract: Catastrophic overfitting (CO) in single-step adversarial training (AT) results in abrupt drops in the adversarial test accuracy (even down to 0%). For models trained with multi-step AT, it has been observed that the loss function behaves locally linearly with respect to the input; this property is, however, lost in single-step AT. To address CO in single-step AT, several methods have been proposed to enforce local linearity of the loss via regularization. However, these regularization terms considerably slow down training due to Double Backpropagation. Instead, in this work, we introduce a regularization term, called ELLE, to mitigate CO effectively and efficiently in classical AT evaluations, as well as in some more difficult regimes, e.g., large adversarial perturbations and long training schedules. Our regularization term can be theoretically linked to the curvature of the loss function and is computationally cheaper than previous methods because it avoids Double Backpropagation. Our thorough experimental validation demonstrates that our work does not suffer from CO, even in challenging settings where previous works suffer from it. We also notice that adapting our regularization parameter during training (ELLE-A) greatly improves the performance, especially in large $\epsilon$ setups. Our implementation is available in
Machine Learning, Artificial Intelligence, Cryptography and Security
What problem does this paper attempt to address?
This paper addresses Catastrophic Overfitting (CO) in single-step adversarial training (AT). Specifically:

1. **Background and challenges**:
   - In multi-step adversarial training, the loss function behaves locally linearly with respect to the input, but this property is lost in single-step adversarial training.
   - Single-step adversarial training (such as FGSM) is cheaper than multi-step training, but it is prone to CO: the adversarial test accuracy drops sharply, even to 0%, while the adversarial accuracy against the single-step training attack rises to 100%.
2. **Limitations of existing methods**:
   - Existing methods enforce local linearity of the loss function through regularization, but they usually require Double Backpropagation, which slows training considerably.
   - These methods can still suffer from CO under large perturbation sizes (large $\epsilon$) or long training schedules.
3. **The method proposed in the paper**:
   - The paper introduces a new regularization term, ELLE (Efficient Local Linearity Enforcement), to overcome CO.
   - ELLE enforces local linearity of the loss function without Double Backpropagation, which makes it computationally cheaper than previous regularizers.
   - An adaptive version, ELLE-A, adjusts the regularization parameter during training according to the local linearity error, which further improves performance, especially for large $\epsilon$.
4. **Theoretical and experimental verification**:
   - Theoretically, the ELLE regularization term is linked to the curvature of the loss function, so it can signal the onset of CO and help avoid it.
   - Experimentally, ELLE and ELLE-A perform well on standard benchmark datasets (such as CIFAR10, CIFAR100, SVHN and ImageNet) and avoid CO in challenging settings where previous methods suffer from it.
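To make the idea of measuring local linearity without input gradients concrete, here is a minimal pure-Python sketch. It assumes the linearity error is taken as the squared gap between the loss at a random convex combination of two points in the $\epsilon$-ball and the same combination of their losses; this form, and the names `linearity_error`, `linear_loss` and `curved_loss`, are illustrative assumptions rather than the paper's confirmed implementation.

```python
import random


def linearity_error(loss, x, eps, rng):
    """Squared deviation of `loss` from linear behaviour near `x`.

    Assumed form (not taken from the summary above): draw two random points
    in the L-infinity ball of radius `eps`, mix them with a random weight,
    and compare the loss at the mix with the mix of the two losses.
    Only forward evaluations of `loss` are needed, no gradient w.r.t. x.
    """
    x_a = [xi + rng.uniform(-eps, eps) for xi in x]
    x_b = [xi + rng.uniform(-eps, eps) for xi in x]
    alpha = rng.random()
    x_c = [(1 - alpha) * a + alpha * b for a, b in zip(x_a, x_b)]
    gap = loss(x_c) - ((1 - alpha) * loss(x_a) + alpha * loss(x_b))
    return gap ** 2


def linear_loss(x):
    # Hypothetical loss that is exactly linear in the input.
    return 2.0 * x[0] - 3.0 * x[1] + 1.0


def curved_loss(x):
    # Hypothetical loss with non-zero curvature.
    return x[0] ** 2 + x[1] ** 2


def mean_error(loss, x, eps, n, seed=0):
    """Average the linearity error over `n` random draws."""
    rng = random.Random(seed)
    return sum(linearity_error(loss, x, eps, rng) for _ in range(n)) / n
```

For a loss that is linear in its input the error vanishes up to floating-point rounding, while a curved loss yields a strictly positive average. In an actual training loop such a term would be added to the adversarial loss and differentiated with respect to the model parameters only, which is what would make Double Backpropagation unnecessary under this assumption.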
In conclusion, by proposing ELLE and its adaptive version ELLE-A, this paper provides an efficient and effective solution to the problem of Catastrophic Overfitting in single-step adversarial training.
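The adaptive variant can be pictured with a small sketch as well. The summary only states that ELLE-A adjusts the regularization parameter according to the local linearity error; the specific rule below (jump to a maximum weight when the error spikes above the running mean plus `k` standard deviations, otherwise decay the weight) and all names and default values are assumptions made for illustration.

```python
import statistics


def update_lambda(lam, err, history, lam_max=1.0, decay=0.99, k=2.0):
    """Return an updated regularization weight given a new linearity error.

    Hypothetical rule: treat `err` as a spike if it exceeds
    mean(history) + k * std(history); on a spike set the weight to
    `lam_max`, otherwise multiply it by `decay`. `history` is mutated:
    the new error is appended after the comparison.
    """
    spike = False
    if len(history) >= 2:
        threshold = statistics.mean(history) + k * statistics.pstdev(history)
        spike = err > threshold
    history.append(err)
    return lam_max if spike else lam * decay


def run_schedule(errors, lam=0.0):
    """Apply `update_lambda` over a sequence of errors, returning all weights."""
    history = []
    weights = []
    for err in errors:
        lam = update_lambda(lam, err, history)
        weights.append(lam)
    return weights
```

With such a rule the regularizer stays inactive while the loss remains close to linear and is switched on only when the linearity error rises abruptly, which is one plausible way to spend regularization strength only around the onset of CO.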