Parameter-constrained adversarial training

Zhengjie Deng, Yufan Wei
DOI: https://doi.org/10.1109/CBASE60015.2023.10439127
2023-11-03
Abstract: Adversarial training is a simple and effective defense against adversarial attacks, but most adversarial training methods are expensive in both time and computation. To improve efficiency, single-step adversarial training (SSAT) generates adversarial examples with the Fast Gradient Sign Method (FGSM). However, SSAT suffers from a severe problem known as catastrophic overfitting (CO): during training, the model's accuracy on adversarial examples generated with Projected Gradient Descent (PGD) can drop to 0%. In this work, we propose parameter-constrained adversarial training (PCAT). We improve the initial perturbation used in FGSM-RS to make the adversarial examples more effective. We also observe how the model's gradients differ before and after adversarial samples are input, and accordingly freeze selected layers in subsequent training to prevent adversarial information from interfering with the model. Extensive experiments demonstrate that our approach can eliminate CO and further enhance the model's robustness against strong adversaries.
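For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of one FGSM step taken from a uniform random start (the scheme commonly abbreviated FGSM-RS). It is not the PCAT method: the abstract does not specify the improved initialization or the layer-freezing criterion, so neither is shown. The toy logistic-regression model, the helper names (`sign`, `input_gradient`, `fgsm_rs`), and the values of `eps` and `alpha` are assumptions chosen only for illustration.

```python
import math
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def input_gradient(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x.

    Toy linear model (an assumption for this sketch):
    p = sigmoid(w.x + b), label y in {0, 1}, so dL/dx = (p - y) * w.
    """
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def fgsm_rs(w, b, x, y, eps, alpha, rng):
    """One signed-gradient step from a random start, kept inside the eps-ball."""
    # Random start: each coordinate of delta is drawn from Uniform(-eps, eps).
    delta = [rng.uniform(-eps, eps) for _ in x]
    x_start = [xi + di for xi, di in zip(x, delta)]
    # Gradient is evaluated at the randomly perturbed input.
    grad = input_gradient(w, b, x_start, y)
    # Single ascent step of size alpha along the gradient sign.
    delta = [di + alpha * sign(gi) for di, gi in zip(delta, grad)]
    # Project back onto the L-infinity ball of radius eps.
    delta = [max(-eps, min(eps, di)) for di in delta]
    return [xi + di for xi, di in zip(x, delta)]


# Hypothetical toy values; alpha = 1.25 * eps is a commonly used step size.
rng = random.Random(0)
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.3, 0.7, -0.2], 1
x_adv = fgsm_rs(w, b, x, y, eps=0.1, alpha=0.125, rng=rng)
```

In a single-step adversarial training loop, `x_adv` would replace `x` in the parameter update for that batch. Because only one gradient evaluation is needed per example, this is much cheaper than multi-step PGD, which is the efficiency gain the abstract refers to.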
Computer Science