Abstract: Deep neural networks have been shown to be vulnerable to carefully designed adversarial examples, and adversarial defense algorithms are drawing increasing attention. Pre-processing based defense is a major strategy, and learning robust feature representations has been shown to be an effective way to boost generalization. However, existing defense works do not consider visual features at different depth levels during training. In this paper, we first highlight two novel properties of robust features from the feature distribution perspective: 1) \textbf{Diversity}. The robust features of intra-class samples should maintain appropriate diversity; 2) \textbf{Discriminability}. The robust features of inter-class samples should ensure adequate separation. We find that state-of-the-art defense methods aim to address both of these issues. This motivates us to increase intra-class variance and decrease inter-class discrepancy simultaneously in adversarial training. Specifically, we propose a simple but effective defense based on decoupled visual representation masking. The designed Decoupled Visual Feature Masking (DFM) block adaptively disentangles visual discriminative features and non-visual features with diverse mask strategies, while suitably discarding information disrupts adversarial noise and improves robustness. Our work provides a generic, easy-to-plug-in block that any existing adversarial training algorithm can use to achieve better overall protection. Extensive experimental results show that the proposed method achieves superior performance compared with state-of-the-art defense approaches. The code is publicly available at \href{https://github.com/chenboluo/Adversarial-defense}{https://github.com/chenboluo/Adversarial-defense}.
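The two feature-distribution properties named in the abstract can be made concrete with simple statistics. The sketch below is illustrative only and is not taken from the paper or its repository: it assumes diversity is measured as the mean squared distance of each feature to its class centroid, and discriminability as the mean Euclidean distance between class centroids. The function names `class_centroids`, `intra_class_variance`, and `inter_class_distance` are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Return {label: centroid} where each centroid is the per-dimension mean."""
    groups = defaultdict(list)
    for feat, label in zip(features, labels):
        groups[label].append(feat)
    return {
        label: [sum(col) / len(feats) for col in zip(*feats)]
        for label, feats in groups.items()
    }


def intra_class_variance(features, labels):
    """Assumed diversity proxy: mean squared distance to the own-class centroid."""
    centroids = class_centroids(features, labels)
    total = sum(
        sum((a - b) ** 2 for a, b in zip(feat, centroids[label]))
        for feat, label in zip(features, labels)
    )
    return total / len(features)


def inter_class_distance(features, labels):
    """Assumed discriminability proxy: mean distance between class centroids."""
    centroids = list(class_centroids(features, labels).values())
    if len(centroids) < 2:
        raise ValueError("need at least two classes")
    pairs = [(p, q) for i, p in enumerate(centroids) for q in centroids[i + 1:]]
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)


# Toy 2-D features: two classes, each with two samples.
feats = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]]
labs = [0, 0, 1, 1]
diversity = intra_class_variance(feats, labs)
separation = inter_class_distance(feats, labs)
```

Under these assumed definitions, a larger `diversity` value means samples of the same class are spread more widely in feature space, and a larger `separation` value means class centroids lie further apart.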

Improving Adversarial Robustness via Attention and Adversarial Logit Pairing

Attack As Defense: Characterizing Adversarial Examples Using Robustness

GAAT: Group Adaptive Adversarial Training to Improve the Trade-Off Between Robustness and Accuracy

A Universal Defense Strategy Against Adversarial Attacks Based on Attention-Guided

Enhancing Robust Representation in Adversarial Training: Alignment and Exclusion Criteria

Attacking Adversarial Attacks as A Defense

Feature Augmentation for Adversarial Robustness

Improving Adversarial Robustness Requires Revisiting Misclassified Examples

Improved Adversarial Training Through Adaptive Instance-wise Loss Smoothing

Improving Adversarial Robustness via Decoupled Visual Representation Masking

Feature Denoising for Improving Adversarial Robustness

Robust Superpixel-Guided Attentional Adversarial Attack

Adversarial robustness improvement for deep neural networks

Improving Machine Learning Robustness via Adversarial Training

Towards Robustness against Unsuspicious Adversarial Examples

Adversarial Attacks on ML Defense Models Competition

Defending Adversarial Attacks by Correcting Logits

Self-adaptive logit balancing for deep neural network robustness: Defence and detection of adversarial attacks

Self-Adaptive Logit Balancing for Deep Learning Robustness in Computer Vision

Adversarial Training with Bi-directional Likelihood Regularization for Visual Classification

Global Adversarial Attacks for Assessing Deep Learning Robustness