RDAT: an efficient regularized decoupled adversarial training mechanism
Yishan Li, Yanming Guo, Yulun Wu, Yuxiang Xie, Mingrui Lao, Tianyuan Yu, Yirun Ruan
DOI: https://doi.org/10.1007/s13735-024-00330-y
2024-05-08
International Journal of Multimedia Information Retrieval
Abstract: Adversarial examples have exposed the inherent vulnerabilities of deep neural networks. Although adversarial training has emerged as the leading strategy for adversarial defense, it is frequently hindered by the difficulty of balancing accuracy on unaltered examples against model robustness. Recent efforts to decouple network components can effectively reduce the degradation of classification accuracy, but at the cost of unsatisfactory robust accuracy, and they may suffer from robust overfitting. In this paper, we investigate the underlying causes of this compromise and introduce a novel framework, the Regularized Decoupled Adversarial Training Mechanism (RDAT), to deal effectively with both the trade-off and the overfitting. Specifically, RDAT comprises two distinct modules. The regularization module mitigates harmful perturbations by controlling the data distribution distance between examples before and after adversarial attacks. The decoupling training module separates clean and adversarial examples so that each can have its own optimization strategy, avoiding suboptimal results in adversarial training. With a marginal compromise in classification accuracy, RDAT achieves markedly better model robustness, improving robust accuracy by an average of 4.47% on CIFAR-10 and 3.23% on CIFAR-100 compared with state-of-the-art methods.
computer science, artificial intelligence, software engineering
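
The two modules described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measure or the loss weighting, so the KL divergence between predicted class distributions, the weight `lam`, and every function name below are assumptions chosen only for illustration. The clean and adversarial terms are returned separately to suggest how a decoupled scheme could optimize each with its own strategy.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); used here as an ASSUMED stand-in for the distribution distance."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def loss_terms(clean_logits, adv_logits, label, lam=1.0):
    """Return separate clean, adversarial, and regularization terms.

    `lam` is a hypothetical weight on the regularizer; keeping the clean and
    adversarial terms apart lets a caller treat them with different strategies.
    """
    p_clean = softmax(clean_logits)
    p_adv = softmax(adv_logits)
    return {
        "clean": cross_entropy(p_clean, label),
        "adversarial": cross_entropy(p_adv, label),
        "regularizer": lam * kl_divergence(p_clean, p_adv),
    }


# Hypothetical logits for one example before and after an attack.
terms = loss_terms([2.0, 1.0, 0.1], [0.5, 1.5, 0.1], label=0)
print(terms)
```

In this sketch the regularizer is zero when the attack leaves the predicted distribution unchanged and grows as the adversarial prediction drifts away from the clean one, which is the qualitative behavior the abstract attributes to the regularization module.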