Abstract: Deep neural networks are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations to clean inputs. Although many attack methods achieve high success rates in the white-box setting, they often exhibit weak transferability in the black-box setting. Recently, various methods have been proposed to improve adversarial transferability, among which input transformation is one of the most effective. In this work, we observe that existing input transformation-based works mainly augment with transformed data from the same domain. Inspired by domain generalization, we aim to further improve transferability using data augmented from different domains. Specifically, a style transfer network can alter the distribution of low-level visual features in an image while preserving the semantic content perceived by humans. Hence, we propose a novel attack method named Style Transfer Method (STM) that uses an arbitrary style transfer network to transform images into different domains. To avoid inconsistent semantic information in the stylized images for the classification network, we fine-tune the style transfer network and mix the generated images, perturbed with random noise, with the original images to maintain semantic consistency and boost input diversity. Extensive experimental results on the ImageNet-compatible dataset show that our proposed method significantly improves adversarial transferability on both normally trained and adversarially trained models compared with state-of-the-art input transformation-based attacks. Code is available at: https://github.com/Zhijin-Ge/STM.
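The augmentation step described in the abstract (stylize, add random noise, mix with the original image) can be illustrated with a rough sketch. Everything below is an assumption made for illustration and is not taken from the authors' code: `toy_stylize` is a hypothetical stand-in that only matches per-channel pixel statistics instead of running a fine-tuned style transfer network, and the names `augment`, `gamma`, and `noise_scale`, along with their default values, are invented here. The fine-tuning of the style transfer network and the gradient computation of the attack itself are not shown.

```python
import numpy as np


def toy_stylize(content, style):
    """Hypothetical stand-in for a style transfer network.

    Shifts the per-channel mean and standard deviation of `content`
    (H x W x C, values in [0, 1]) to those of `style`. A real arbitrary
    style transfer network operates on learned features, not raw pixels.
    """
    c_mean = content.mean(axis=(0, 1), keepdims=True)
    c_std = content.std(axis=(0, 1), keepdims=True) + 1e-8
    s_mean = style.mean(axis=(0, 1), keepdims=True)
    s_std = style.std(axis=(0, 1), keepdims=True)
    return np.clip((content - c_mean) / c_std * s_std + s_mean, 0.0, 1.0)


def augment(x, styles, gamma=0.5, noise_scale=0.05, rng=None):
    """Build one augmented copy of `x` per style image.

    Each copy is gamma * x + (1 - gamma) * (stylized + noise), clipped to
    [0, 1]. The mixing form and the uniform noise are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    augmented = []
    for style in styles:
        stylized = toy_stylize(x, style)
        noise = rng.uniform(-noise_scale, noise_scale, size=x.shape)
        mixed = gamma * x + (1.0 - gamma) * (stylized + noise)
        augmented.append(np.clip(mixed, 0.0, 1.0))
    return np.stack(augmented)


rng = np.random.default_rng(0)
x = rng.random((8, 8, 3))                            # toy "clean" image
styles = [rng.random((8, 8, 3)) for _ in range(3)]   # toy style images
batch = augment(x, styles, rng=rng)
```

In a transfer attack, a batch like this would typically be fed to the surrogate classifier and the input gradients averaged over the copies; that use is likewise an assumption here, not a detail stated in the abstract.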

RL-VAEGAN: Adversarial Defense for Reinforcement Learning Agents Via Style Transfer

Natural Black-Box Adversarial Examples Against Deep Reinforcement Learning

Robust Deep Reinforcement Learning with Adversarial Attacks

AdversarialStyle: GAN Based Style Guided Verification Framework for Deep Learning Systems

Transferable Adversarial Attacks on Deep Reinforcement Learning with Domain Randomization

Mitigating Adversarial Perturbations for Deep Reinforcement Learning via Vector Quantization

Towards Governing Agent's Efficacy: Action-Conditional $β$-VAE for Deep Transparent Reinforcement Learning

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

Robustifying Reinforcement Learning Agents via Action Space Adversarial Training

Controlling Neural Style Transfer with Deep Reinforcement Learning

Adversary Agnostic Robust Deep Reinforcement Learning

Adversarial Policies: Attacking Deep Reinforcement Learning

On the Perturbed States for Transformed Input-robust Reinforcement Learning

Robust Reinforcement Learning on State Observations with Learned Optimal Adversary

Attacking Visually-aware Recommender Systems with Transferable and Imperceptible Adversarial Styles

Attacking and Defending Deep Reinforcement Learning Policies

Towards Deep Learning Models Resistant to Transfer-based Adversarial Attacks via Data-centric Robust Learning

Towards Secure Multi-Agent Deep Reinforcement Learning: Adversarial Attacks and Countermeasures

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

Multiple-Model Based Defense for Deep Reinforcement Learning Against Adversarial Attack

Improving the Transferability of Adversarial Examples with Arbitrary Style Transfer