Abstract: Adversarial training is regarded as one of the most effective heuristic defense methods. However, most existing work focuses solely on image-space adversarial training and leaves the complementary semantic space unexplored. Semantic-space adversarial training can compensate for the limited diversity of adversarial examples produced in the image space alone, and thereby helps improve model robustness. It is therefore sensible to learn from both adversarial images and adversarial features. To this end, this paper proposes an image-semantic dual adversarial training framework (ISDAT) that enhances the robustness of classification models against multiple attacks. In the inner loop of ISDAT, benign images and semantic features are perturbed through an image-space path and a semantic-space path, respectively, to craft adversarial images and adversarial features. To determine which intermediate layer of semantic features, when attacked, contributes most to the model's anti-attack capability, we provide a theoretical analysis for guidance, which avoids invalid neuron-importance predictions and excessive computation. To ensure that adversarial images and adversarial features each contribute to model robustness, we advocate crafting them from diverse loss views. Specifically, we develop a C2 loss for adversarial feature generation that involves semantic variance, aggressiveness, and high confidence. In the outer loop of ISDAT, to promote the model's comprehensive understanding of both adversarial images and adversarial features, we present a joint image-semantic-guided model defense method. Specifically, we develop an adversarial image-semantic perception loss (IS). Driven by this loss, we further establish an image-semantic end-to-end optimization process that allows dual learning from both adversarial images and features. Experimental results on the CIFAR-10, CIFAR-100, and SVHN datasets demonstrate the effectiveness of ISDAT in defending against multiple white-box and black-box attacks. The code will be available at https://github.com/flower6top.
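The inner-loop/outer-loop structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' ISDAT implementation: the abstract does not define the C2 loss or the IS loss, so the sketch assumes plain binary cross-entropy for both paths, a single signed-gradient (FGSM-style) step for each perturbation, and a toy model with one tanh feature layer standing in for the "semantic" layer. All function names (`features`, `head`, `perturb_image`, `perturb_feature`, `train_step`) and all hyperparameters are hypothetical.

```python
import math

def sign(g):
    return (g > 0) - (g < 0)

def features(W, x):
    """Toy 'semantic' layer (assumption): h = tanh(W x)."""
    return [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W]

def head(v, b, h, y):
    """Binary cross-entropy of a linear head on h. Returns (loss, dL/dz)."""
    z = sum(vi * hi for vi, hi in zip(v, h)) + b
    s = z if y == 1 else -z
    loss = max(-s, 0.0) + math.log1p(math.exp(-abs(s)))  # stable log(1 + e^-s)
    p = 1.0 / (1.0 + math.exp(-z))
    return loss, p - y

def perturb_image(W, v, b, x, y, eps):
    """Image-space path: one signed-gradient step on the input x."""
    h = features(W, x)
    _, dz = head(v, b, h, y)
    dpre = [dz * vi * (1.0 - hi * hi) for vi, hi in zip(v, h)]
    dx = [sum(dpre[i] * W[i][j] for i in range(len(W))) for j in range(len(x))]
    return [xj + eps * sign(g) for xj, g in zip(x, dx)]

def perturb_feature(v, b, h, y, eps):
    """Semantic-space path: one signed-gradient step on the features h."""
    _, dz = head(v, b, h, y)
    return [hi + eps * sign(dz * vi) for hi, vi in zip(h, v)]

def train_step(W, v, b, x, y, eps_img=0.1, eps_feat=0.1, lr=0.1):
    """Outer loop: one SGD step on the sum of both adversarial losses.

    The plain sum is an assumption, not the paper's IS loss.
    Perturbations are treated as constants when differentiating.
    """
    # Inner loop: craft an adversarial image and an adversarial feature.
    x_adv = perturb_image(W, v, b, x, y, eps_img)
    h_clean = features(W, x)
    h_sem = perturb_feature(v, b, h_clean, y, eps_feat)
    h_img = features(W, x_adv)
    loss_img, dz_img = head(v, b, h_img, y)
    loss_sem, dz_sem = head(v, b, h_sem, y)
    # Each term: (dL/dz, features fed to the head, unperturbed tanh output, input).
    terms = [(dz_img, h_img, h_img, x_adv), (dz_sem, h_sem, h_clean, x)]
    gW = [[0.0] * len(x) for _ in W]
    gv = [0.0] * len(v)
    gb = 0.0
    for dz, h_used, h_base, inp in terms:
        gb += dz
        for i in range(len(v)):
            gv[i] += dz * h_used[i]
            dpre = dz * v[i] * (1.0 - h_base[i] ** 2)
            for j in range(len(inp)):
                gW[i][j] += dpre * inp[j]
    W = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, gW)]
    v = [vi - lr * g for vi, g in zip(v, gv)]
    b = b - lr * gb
    return W, v, b, loss_img + loss_sem
```

A training run would call `train_step` repeatedly over labeled samples, carrying the returned `W`, `v`, and `b` forward. In a realistic setting the toy layers would be replaced by a deep network with automatic differentiation, and the single-step perturbations by a stronger iterative attack.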

ISDAT: an Image-Semantic Dual Adversarial Training Framework for Robust Image Classification
