Abstract: Adversarial training is regarded as one of the most effective heuristic defense methods. Unfortunately, most existing work focuses solely on image-space adversarial training and neglects the complementary semantic space. Semantic-space adversarial training helps compensate for the limited diversity of adversarial examples produced by purely image-space training, and thereby improves model robustness. It is therefore sensible to learn from both adversarial images and adversarial features. This paper proposes an image-semantic dual adversarial training framework (ISDAT) to enhance the robustness of classification models against multiple attacks. In the inner loop of ISDAT, benign images and semantic features are perturbed through the image-space path and the semantic-space path, respectively, to craft adversarial images and adversarial features. To decide which intermediate layer of semantic features to attack so that the model’s resistance to attacks improves most, we provide a theoretical analysis for guidance, which avoids invalid neuron-importance predictions and excessive computation. To ensure that adversarial images and adversarial features each contribute to model robustness, we advocate crafting them from different loss views. Specifically, we develop a C2 loss for adversarial feature generation that accounts for semantic variance, aggressiveness, and high confidence. In the outer loop of ISDAT, to promote the model’s comprehensive understanding of both adversarial images and adversarial features, we present a joint image-semantic-guided defense method. Specifically, we develop an adversarial image-semantic perception (IS) loss. Driven by this loss, we further establish an end-to-end image-semantic optimization process that allows dual learning from both adversarial images and adversarial features.
Experimental results on the CIFAR-10, CIFAR-100, and SVHN datasets demonstrate the effectiveness of ISDAT in defending against multiple white-box and black-box attacks. The code will be available at https://github.com/flower6top.
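
To make the inner-loop/outer-loop structure described in the abstract concrete, the following is a minimal sketch of a dual (image-space plus feature-space) adversarial training step on a toy two-layer linear classifier. It is an illustration under stated assumptions, not the ISDAT implementation: the abstract does not define the C2 loss or the IS loss, so plain cross-entropy, a single signed-gradient step, and a weighted sum with a hypothetical weight `lam` stand in for them. All function names, the toy weights `W1` and `W2`, and the step sizes are assumptions introduced here.

```python
import math

def matvec(W, v):
    """Return W v for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def matTvec(W, v):
    """Return W^T v for a list-of-rows matrix W."""
    return [sum(W[i][j] * v[i] for i in range(len(W))) for j in range(len(W[0]))]

def softmax(z):
    m = max(z)
    e = [math.exp(t - m) for t in z]
    s = sum(e)
    return [t / s for t in e]

def sign(t):
    return (t > 0) - (t < 0)

def ce_from_features(W2, h, y):
    """Cross-entropy of logits W2 h, its gradient w.r.t. h, and w.r.t. the logits."""
    p = softmax(matvec(W2, h))
    d = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    return -math.log(p[y]), matTvec(W2, d), d

def perturb_image(W1, W2, x, y, eps):
    """Image-space path (assumed): one signed-gradient step on the input."""
    _, g_h, _ = ce_from_features(W2, matvec(W1, x), y)
    g_x = matTvec(W1, g_h)  # chain rule through the feature layer h = W1 x
    return [xi + eps * sign(g) for xi, g in zip(x, g_x)]

def perturb_feature(W1, W2, x, y, eps):
    """Semantic-space path (assumed): one signed-gradient step on the features."""
    h = matvec(W1, x)
    _, g_h, _ = ce_from_features(W2, h, y)
    return [hi + eps * sign(g) for hi, g in zip(h, g_h)]

def dual_adv_step(W1, W2, x, y, eps_x=0.1, eps_h=0.1, lam=0.5, lr=0.1):
    """One outer-loop update on lam * L(adv image) + (1 - lam) * L(adv feature).

    The adversarial feature is treated as a constant, so it only updates W2.
    """
    # Inner loop: craft both kinds of adversarial example from the benign input.
    x_adv = perturb_image(W1, W2, x, y, eps_x)
    h_adv = perturb_feature(W1, W2, x, y, eps_h)
    # Outer loop: evaluate both losses and take one gradient-descent step.
    h_img = matvec(W1, x_adv)
    loss_img, g_h_img, d_img = ce_from_features(W2, h_img, y)
    loss_feat, _, d_feat = ce_from_features(W2, h_adv, y)
    new_W2 = [[W2[k][j] - lr * (lam * d_img[k] * h_img[j]
                                + (1 - lam) * d_feat[k] * h_adv[j])
               for j in range(len(W2[0]))] for k in range(len(W2))]
    new_W1 = [[W1[j][i] - lr * lam * g_h_img[j] * x_adv[i]
               for i in range(len(W1[0]))] for j in range(len(W1))]
    return new_W1, new_W2, lam * loss_img + (1 - lam) * loss_feat

# Toy usage: 3 input pixels, 2 features, 2 classes (all values are made up).
demo_W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
demo_W2 = [[1.0, -1.0], [-0.5, 0.7]]
demo_W1, demo_W2, demo_loss = dual_adv_step(demo_W1, demo_W2, [0.2, 0.4, 0.6], 0)
```

In a real network the feature-space perturbation would be applied at a chosen intermediate layer, which is the choice the abstract says the paper's theoretical analysis addresses; the toy model here has only one such layer.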

Improving adversarial robustness of deep neural networks by using semantic information

Attack As Defense: Characterizing Adversarial Examples Using Robustness

Analyzing Adversarial Robustness of Deep Neural Networks in Pixel Space: a Semantic Perspective

Toward Adversarial Robustness via Semi-supervised Robust Training

Adversarial robustness improvement for deep neural networks

Semantically Consistent Visual Representation for Adversarial Robustness

Improving Model Robustness Against Adversarial Examples with Redundant Fully Connected Layer

Improving the Robustness of Deep Convolutional Neural Networks Through Feature Learning

A general approach to improve adversarial robustness of DNNs for medical image segmentation and detection

DeepDefense: Training Deep Neural Networks with Improved Robustness

Semi-supervised Robust Training with Generalized Perturbed Neighborhood

Improving Adversarial Robustness Requires Revisiting Misclassified Examples

Towards Robustness against Unsuspicious Adversarial Examples

SemanticAdv: Generating Adversarial Examples via Attribute-conditional Image Editing

Progressive Diversified Augmentation for General Robustness of DNNs: A Unified Approach

ISDAT: an Image-Semantic Dual Adversarial Training Framework for Robust Image Classification

Exploring Robust Features for Improving Adversarial Robustness

Mitigating Adversarial Attacks for Deep Neural Networks by Input Deformation and Augmentation

Improving Adversarial Robustness via Attention and Adversarial Logit Pairing

Deep Defense: Training DNNs with Improved Adversarial Robustness

Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing