Abstract: Recent advances in deep neural network (DNN) techniques have increased the importance of the security and robustness of the algorithms in which DNNs are applied. However, several studies have demonstrated that neural networks are vulnerable to adversarial examples, which are generated by adding crafted adversarial noise to input images. Because adversarial noise is typically imperceptible to the human eye, it is difficult to defend DNNs against such attacks. One method of defense is to detect adversarial examples by analyzing the characteristics of the input images. Recent studies have used the hidden-layer outputs of the target classifier to improve robustness, but this requires access to the target classifier. Moreover, these methods provide no post-processing step for the detected adversarial examples; they simply discard them. To resolve these problems, we propose a novel detection-based method that predicts the adversarial noise and detects adversarial examples based on the predicted noise, without any information from the target classifier. We first generated adversarial examples and the corresponding adversarial noise, obtained as the residual between the original and adversarial images. Subsequently, we trained the proposed adversarial noise predictor to estimate the adversarial noise image, and trained the adversarial detector using the input images and the predicted noise. The proposed framework has the advantage of being agnostic to the input image modality. Moreover, the predicted noise can be used to reconstruct detected adversarial examples as non-adversarial images instead of discarding them. We tested the proposed method against the fast gradient sign method (FGSM), basic iterative method (BIM), projected gradient descent (PGD), DeepFool, and Carlini & Wagner adversarial attacks on the CIFAR-10 and CIFAR-100 datasets provided by the Canadian Institute for Advanced Research (CIFAR). Our method demonstrated significant improvements in detection accuracy compared with state-of-the-art methods. The results also showed that the noise predictor successfully captured noise in the Fourier domain, which improved detection performance. Finally, reconstructing detected adversarial examples with the predicted noise resolved the post-processing problem, so that detected examples no longer need to be discarded.
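
The relationship between an adversarial example, its noise residual, and the reconstructed image can be illustrated with a minimal sketch. This is not the authors' implementation: a toy logistic-regression model (hypothetical weights `w`, bias `b`) stands in for the target DNN, the attack is a single FGSM step, and the true residual is used in place of the output of a trained noise predictor, which the abstract does not specify.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sign(v):
    return (v > 0) - (v < 0)

def predict(x, w, b):
    """Probability of class 1 under a toy logistic-regression 'classifier'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, y, w, b, eps):
    """One FGSM step: move each pixel by eps in the direction that raises the loss."""
    p = predict(x, w, b)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad = [(p - y) * wi for wi in w]
    # Step along the gradient sign, then clip to the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

def residual(x_adv, x):
    """Adversarial noise: the residual between adversarial and original images."""
    return [a - c for a, c in zip(x_adv, x)]

def reconstruct(x_adv, predicted_noise):
    """Subtract the predicted noise to recover a non-adversarial image."""
    return [min(1.0, max(0.0, a - n)) for a, n in zip(x_adv, predicted_noise)]

# Hypothetical 4-pixel "image", true label, and model parameters.
x = [0.2, 0.5, 0.8, 0.4]
y = 1
w = [1.5, -2.0, 0.5, 1.0]
b = -0.1

x_adv = fgsm(x, y, w, b, eps=0.03)
noise = residual(x_adv, x)           # training target for a noise predictor
restored = reconstruct(x_adv, noise) # assumes a perfect noise prediction

p_clean = predict(x, w, b)
p_adv = predict(x_adv, w, b)
```

With a perfect noise prediction the reconstruction recovers the original image exactly (up to floating-point error); a trained predictor would only approximate the residual, so the quality of the reconstruction depends on how well the noise is estimated.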

Adversarial example detection by predicting adversarial noise in the frequency domain
