Abstract: Deep neural networks are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations to clean inputs. Although many attack methods achieve high success rates in the white-box setting, they exhibit weak transferability in the black-box setting. Recently, various methods have been proposed to improve adversarial transferability, among which input transformation is one of the most effective. In this work, we notice that existing input transformation-based works mainly use transformed data from the same domain for augmentation. Inspired by domain generalization, we aim to further improve transferability using data augmented from different domains. Specifically, a style transfer network can alter the distribution of low-level visual features in an image while preserving its semantic content for humans. Hence, we propose a novel attack method named Style Transfer Method (STM) that uses an arbitrary style transfer network to transform images into different domains. Because stylized images may carry semantic information that is inconsistent for the classification network, we fine-tune the style transfer network and mix the generated images, with random noise added, with the original images to maintain semantic consistency and boost input diversity. Extensive experimental results on the ImageNet-compatible dataset show that, compared with state-of-the-art input transformation-based attacks, our proposed method significantly improves adversarial transferability on both normally trained and adversarially trained models. Code is available at: https://github.com/Zhijin-Ge/STM.
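The mixing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_stylize` is a hypothetical stand-in that only perturbs per-channel statistics in place of a real style transfer network, and the names `mix_with_original`, `augmented_batch`, `gamma`, and `noise_std`, along with the choice of Gaussian noise, are assumptions made for illustration.

```python
import numpy as np


def toy_stylize(x, rng):
    """Hypothetical stand-in for a style transfer network.

    Randomly rescales and shifts each channel's statistics of an
    H x W x C image in [0, 1]. This is NOT the paper's network.
    """
    channels = x.shape[-1]
    scale = rng.uniform(0.5, 1.5, size=(1, 1, channels))
    shift = rng.uniform(-0.2, 0.2, size=(1, 1, channels))
    mean = x.mean(axis=(0, 1), keepdims=True)
    return np.clip((x - mean) * scale + mean + shift, 0.0, 1.0)


def mix_with_original(x, stylized, rng, gamma=0.5, noise_std=0.05):
    """Blend the original image with a noisy stylized copy.

    gamma (assumed) weights the original image; noise_std (assumed)
    is the standard deviation of Gaussian noise added to the
    stylized image before mixing.
    """
    noise = rng.normal(0.0, noise_std, size=x.shape)
    mixed = gamma * x + (1.0 - gamma) * (stylized + noise)
    return np.clip(mixed, 0.0, 1.0)


def augmented_batch(x, n_copies=4, seed=0, gamma=0.5, noise_std=0.05):
    """Build n_copies mixed, stylized variants of one image."""
    rng = np.random.default_rng(seed)
    copies = [
        mix_with_original(x, toy_stylize(x, rng), rng, gamma, noise_std)
        for _ in range(n_copies)
    ]
    return np.stack(copies)
```

In a transfer attack of this kind, the gradient of the classification loss would typically be averaged over such a batch before each update of the adversarial example; that step needs a differentiable model and is omitted here.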

Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer

Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks

UATST: Towards Unpaired Arbitrary Text-Guided Style Transfer with Cross-Space Modulation

Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer

StyLess: Boosting the Transferability of Adversarial Examples

Style Transfer in Text: Exploration and Evaluation

Towards Evaluating the Robustness of Automatic Speech Recognition Systems via Audio Style Transfer

StyleFool: Fooling Video Classification Systems via Style Transfer

ST$^2$: Small-data Text Style Transfer via Multi-task Meta-Learning

Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation

Textual Adversarial Attack As Combinatorial Optimization

Visual Attack and Defense on Text

Text Adversarial Attacks and Defenses: Issues, Taxonomy, and Perspectives

Low Resource Style Transfer Via Domain Adaptive Meta Learning

Style Transfer as Unsupervised Machine Translation

MSSRNet: Manipulating Sequential Style Representation for Unsupervised Text Style Transfer

Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations

Text Style Transfer Via Learning Style Instance Supported Latent Space

Cycle-Consistent Adversarial Autoencoders for Unsupervised Text Style Transfer

Towards a Robust Deep Neural Network Against Adversarial Texts: A Survey

Towards a Robust Deep Neural Network in Texts: A Survey