Abstract: Deep learning has shown superiority in handling complicated, specialized tasks (e.g., computer vision, audio, and language processing). However, prior work has confirmed that Deep Neural Networks (DNNs) are vulnerable to carefully crafted adversarial perturbations, which cause them to fail on specific tasks. In object detection, the background contributes little to object classification, so adversarial perturbations added to the background do not make the attack more effective at fooling deep neural detection models, yet they introduce substantial distortion into the generated examples. Motivated by this observation, we introduce an adversarial attack algorithm named the Adaptive Object-oriented Adversarial Method (AO²AM). It aims to fool deep neural object detection networks by adaptively accumulating object-based gradients and adding adaptive object-based adversarial perturbations only onto the objects rather than onto the whole frame of the input image. AO²AM can effectively push the representations of the generated adversarial samples close to the decision boundary in the latent space, and it forces deep neural detection networks to yield inaccurate locations and false classifications during object detection. Compared with existing adversarial attack methods, which generate perturbations acting on the global scale of the original inputs, the adversarial examples produced by AO²AM effectively fool deep neural object detection networks while maintaining high structural similarity to the corresponding clean inputs. When attacking Faster R-CNN, AO²AM achieves an attack success rate (ASR) above 98.00% on pre-processed Pascal VOC 2007&2012 (Val) and an SSIM above 0.870. When fooling SSD, AO²AM attains an SSIM exceeding 0.980 under the L2 norm constraint. On SSIM and Mean Attack Ratio, AO²AM outperforms adversarial attack methods based on global-scale perturbations.
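
The core idea of confining the perturbation to object regions can be illustrated with a minimal NumPy sketch. This is not the AO²AM algorithm itself: the adaptive gradient accumulation is not specified in the abstract and is omitted here, and a plain sign-gradient step is swapped in as a stand-in. The helper names `box_mask` and `masked_sign_step`, the `(x1, y1, x2, y2)` box format, the step size `eps`, and the random "gradient" are all assumptions made for illustration.

```python
import numpy as np


def box_mask(shape, boxes):
    """Return an (H, W, 1) mask that is 1 inside any (x1, y1, x2, y2) box."""
    h, w = shape[:2]
    mask = np.zeros((h, w, 1), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        mask[y1:y2, x1:x2, :] = 1.0
    return mask


def masked_sign_step(image, grad, boxes, eps=2.0 / 255.0):
    """One sign-gradient step applied only inside the object boxes.

    Stand-in for an object-restricted perturbation; background pixels
    are multiplied by a zero mask and therefore left unchanged.
    """
    mask = box_mask(image.shape, boxes)
    perturbed = image + eps * np.sign(grad) * mask
    return np.clip(perturbed, 0.0, 1.0)


# Toy data: a random "image" in [0.2, 0.8] and a random stand-in gradient.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(8, 8, 3)).astype(np.float32)
grad = rng.normal(size=image.shape).astype(np.float32)
boxes = [(2, 2, 5, 6)]  # hypothetical detection: columns 2..4, rows 2..5

adv = masked_sign_step(image, grad, boxes)
changed = np.abs(adv - image).max(axis=-1) > 0  # per-pixel change map
```

In a real attack, `grad` would come from the detector's loss with respect to the input and `boxes` from ground-truth or predicted detections. Because the background is never modified, distortion is limited to the object area, which is the property the abstract credits for the higher SSIM.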

Fooling the Textual Fooler via Randomizing Latent Representations

Fooling Neural Network Interpretations - Adversarial Noise to Attack Images.

Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment

AdvFoolGen: Creating Persistent Troubles for Deep Classifiers

TextJuggler: Fooling Text Classification Tasks by Generating High-Quality Adversarial Examples

TransFool: An Adversarial Attack against Neural Machine Translation Models

DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks

Revisiting DeepFool: generalization and improvement

Rethinking Textual Adversarial Defense for Pre-trained Language Models

Defense of Word-level Adversarial Attacks via Random Substitution Encoding

Fooling Examples: Another Intriguing Property of Neural Networks

Fooling deep neural detection networks with adaptive object-oriented adversarial perturbation

Frauds Bargain Attack: Generating Adversarial Text Samples via Word Manipulation Process

Attacking Adversarial Attacks as A Defense

An LLM can Fool Itself: A Prompt-Based Adversarial Attack

FoolChecker: A platform to evaluate the robustness of images against adversarial attacks

Fooling OCR Systems with Adversarial Text Images

TSFool: Crafting Highly-Imperceptible Adversarial Time Series through Multi-Objective Attack

Fooling Explanations in Text Classifiers

Universal Rules for Fooling Deep Neural Networks based Text Classification

FastWordBug: A Fast Method To Generate Adversarial Text Against NLP Applications