ISDAT: an Image-Semantic Dual Adversarial Training Framework for Robust Image Classification

Chenhong Sui, Ao Wang, Haipeng Wang, Hao Liu, Qingtao Gong, Jing Yao, Danfeng Hong
DOI: https://doi.org/10.1016/j.patcog.2024.110968
2025-01-01
Abstract: Adversarial training is known as one of the most effective heuristic defense methods. Unfortunately, most existing work focuses solely on image-space adversarial training and leaves the complementary semantic space unexplored. Semantic-space adversarial training can compensate for the limited diversity of adversarial examples produced by purely image-space training, and thereby improve model robustness. It is therefore sensible to learn from both adversarial images and adversarial features. This paper proposes an image-semantic dual adversarial training framework (ISDAT) to enhance the robustness of classification models against multiple attacks. In the inner loop of ISDAT, benign images and semantic features are perturbed through an image-space path and a semantic-space path, respectively, to craft adversarial images and adversarial features. On the question of which intermediate layer of semantic features, when attacked, contributes most to the model’s anti-attack capability, we provide a theoretical analysis for guidance, avoiding invalid neuron-importance predictions and excessive computation. To ensure that adversarial images and adversarial features each contribute to model robustness, we advocate forging them with diverse loss views. Specifically, we develop a C2 loss for adversarial feature generation that involves semantic variance, aggressiveness, and high confidence. In the outer loop of ISDAT, to promote the model’s comprehensive understanding of both adversarial images and adversarial features, we give a joint image-semantic-guided model defense method. Specifically, we develop an adversarial image-semantic perception loss (IS). Driven by this loss, we further establish an image-semantic end-to-end optimization process, which allows dual learning from both adversarial images and features.
Experimental results on the CIFAR-10, CIFAR-100, and SVHN datasets demonstrate the effectiveness of ISDAT in defending against multiple white-box and black-box attacks. The code will be available at https://github.com/flower6top.
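The inner-loop/outer-loop structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the two-layer network (`TinyNet`), the single-step sign-gradient perturbation, and the use of plain cross-entropy for both the attack and the defense are all assumptions made for illustration. In particular, the paper's C2 loss and IS loss are not reproduced here, and the attacked layer is simply the only hidden layer rather than one chosen by the paper's theoretical analysis.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(p, y):
    return -np.mean(np.log(p[np.arange(len(y)), y] + 1e-12))

class TinyNet:
    """Hypothetical classifier: x -> h = tanh(x W1) -> softmax(h W2)."""
    def __init__(self, d_in, d_hid, n_cls, rng):
        self.W1 = rng.normal(0, 0.1, (d_in, d_hid))
        self.W2 = rng.normal(0, 0.1, (d_hid, n_cls))

    def features(self, x):
        return np.tanh(x @ self.W1)

    def head(self, h):
        return softmax(h @ self.W2)

def feature_grad(net, h, y):
    """Gradient of mean cross-entropy with respect to the hidden features h."""
    p = net.head(h)
    p[np.arange(len(y)), y] -= 1.0
    return (p / len(y)) @ net.W2.T

def craft_adversaries(net, x, y, eps_img, eps_feat):
    """Inner loop (assumed single-step, FGSM-style): perturb images and features."""
    h = net.features(x)
    g_h = feature_grad(net, h, y)
    g_x = (g_h * (1 - h ** 2)) @ net.W1.T        # chain rule through tanh
    x_adv = x + eps_img * np.sign(g_x)           # image-space path
    h_adv = h + eps_feat * np.sign(g_h)          # semantic-space path
    return x_adv, h_adv

def train_step(net, x, y, eps_img=0.1, eps_feat=0.1, lr=0.5):
    """Outer loop: one gradient step on the summed loss of both adversarial views."""
    n = len(y)
    onehot = np.eye(net.W2.shape[1])[y]
    x_adv, h_adv = craft_adversaries(net, x, y, eps_img, eps_feat)
    h_clean = net.features(x)

    h_a = net.features(x_adv)                    # adversarial-image branch
    p_a = net.head(h_a)
    d_a = (p_a - onehot) / n

    p_b = net.head(h_adv)                        # adversarial-feature branch
    d_b = (p_b - onehot) / n                     # perturbation treated as constant

    gW2 = h_a.T @ d_a + h_adv.T @ d_b
    gW1 = (x_adv.T @ ((d_a @ net.W2.T) * (1 - h_a ** 2))
           + x.T @ ((d_b @ net.W2.T) * (1 - h_clean ** 2)))
    loss = cross_entropy(p_a, y) + cross_entropy(p_b, y)

    net.W1 -= lr * gW1
    net.W2 -= lr * gW2
    return loss
```

Calling `train_step` repeatedly on a batch `(x, y)` alternates the two loops: each call first crafts fresh adversarial images and features against the current weights, then updates the weights on both. A practical version would likely use an autodiff framework and a multi-step attack; the manual gradients here only keep the sketch dependency-free.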