Revisiting single-step adversarial training for robustness and generalization
Zhuorong Li, Daiwei Yu, Minghui Wu, Sixian Chan, Hongchuan Yu, Zhike Han
DOI: https://doi.org/10.1016/j.patcog.2024.110356
IF: 8
2024-02-28
Pattern Recognition
Abstract: Single-step adversarial training has recently attracted considerable attention because it offers both robustness and efficiency. However, a phenomenon referred to as "catastrophic overfitting" has been observed, which is prevalent in single-step defenses and may frustrate attempts to use FGSM adversarial training. To address this issue, we propose a novel method, Stable and Efficient Adversarial Training (SEAT). SEAT mitigates catastrophic overfitting by exploiting local properties that differentiate a robust model from one prone to catastrophic overfitting. SEAT is supported by theoretical justification: minimizing the SEAT loss is shown to promote a smoother empirical risk, which in turn enhances robustness. Experimental results demonstrate that the proposed method successfully mitigates catastrophic overfitting and yields superior performance among efficient defenses. Our single-step method reaches 51% robust accuracy on CIFAR-10 with ℓ∞ perturbations of radius 8/255 under a strong PGD-50 attack, matching the performance of a 10-step iterative method at merely 3% of the computational cost.
computer science, artificial intelligence; engineering, electrical & electronic
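
The abstract refers to FGSM adversarial training under an ℓ∞ budget of radius 8/255. For orientation only, a generic single-step FGSM perturbation might be sketched as below. This is not the SEAT loss, whose form the abstract does not state; the logistic model, the function names `loss_and_input_grad` and `fgsm`, and the random data are all hypothetical stand-ins chosen so the example runs with NumPy alone.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_input_grad(w, b, x, y):
    """Per-example binary cross-entropy of a logistic model, and its gradient w.r.t. x."""
    p = sigmoid(x @ w + b)
    tiny = 1e-12  # guards log(0)
    loss = -(y * np.log(p + tiny) + (1.0 - y) * np.log(1.0 - p + tiny))
    # For logistic regression, d(loss)/dx = (p - y) * w
    grad_x = (p - y)[:, None] * w[None, :]
    return loss, grad_x

def fgsm(w, b, x, y, radius=8 / 255):
    """One signed-gradient step of size `radius`, then clip to the valid pixel range [0, 1]."""
    _, grad_x = loss_and_input_grad(w, b, x, y)
    x_adv = x + radius * np.sign(grad_x)
    return np.clip(x_adv, 0.0, 1.0)

# Hypothetical toy data: 4 inputs with 16 "pixels" each, values in [0, 1].
rng = np.random.default_rng(0)
w = rng.normal(size=16)
b = 0.0
x = rng.uniform(0.0, 1.0, size=(4, 16))
y = np.array([0.0, 1.0, 0.0, 1.0])

x_adv = fgsm(w, b, x, y)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
```

In single-step adversarial training, examples such as `x_adv` would replace or augment the clean inputs when computing the training loss; a PGD-style attack, by contrast, repeats a smaller signed step several times with projection back into the ℓ∞ ball.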