Abstract: Adversarial attacks meticulously generate minuscule, imperceptible perturbations to images to deceive neural networks. To counter them, adversarial purification methods seek to transform adversarial input samples into clean output images. Nonetheless, extant generative models fail to effectively eliminate adversarial perturbations, yielding less-than-ideal purification results. We emphasize the potential threat of residual adversarial perturbations to target models, quantitatively establishing a relationship between perturbation scale and attack capability. Notably, the residual perturbations on the purified image primarily stem from the same-position patch and similar patches of the adversarial sample. We propose a novel adversarial purification approach named Information Mask Purification (IMPure), which aims to extensively eliminate adversarial perturbations. Given an adversarial sample, we first mask part of the patch information, then reconstruct the patches to resist the adversarial perturbations they carry. We reconstruct all patches in parallel to obtain a cohesive image. Then, to protect the purified samples against potential perturbations from similar regions, we simulate this risk by randomly mixing the purified samples with the input samples before feeding them into the feature extraction network. Finally, we establish a combined constraint of pixel loss and perceptual loss to augment the model's reconstruction adaptability. Extensive experiments on the ImageNet dataset with three classifier models demonstrate that our approach achieves state-of-the-art results against nine adversarial attack methods. Implementation code and pre-trained weights can be accessed at https://github.com/NoWindButRain/IMPure.
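The steps described in the abstract (patch masking, random mixing of purified and input samples, and a combined pixel and perceptual loss) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the whole-patch zero masking, the patch-level mixing granularity, the mean-squared-error form of both loss terms, and the weight `lam` are all assumptions made for illustration, and `feat_fn` is a hypothetical stand-in for the feature extraction network.

```python
import numpy as np

def split_patches(img, p):
    """Split an (H, W, C) image into a (H//p, W//p, p, p, C) grid of patches."""
    h, w, c = img.shape
    return img.reshape(h // p, p, w // p, p, c).swapaxes(1, 2)

def merge_patches(patches):
    """Inverse of split_patches: rebuild the (H, W, C) image."""
    gh, gw, p, _, c = patches.shape
    return patches.swapaxes(1, 2).reshape(gh * p, gw * p, c)

def mask_patches(img, p, ratio, rng):
    """Zero out a random subset of patches (assumed masking scheme)."""
    patches = split_patches(img, p).copy()
    gh, gw = patches.shape[:2]
    mask = rng.random((gh, gw)) < ratio  # True where a patch is masked
    patches[mask] = 0.0
    return merge_patches(patches), mask

def random_mix(purified, original, p, mix_prob, rng):
    """Replace random patches of the purified image with input patches,
    simulating perturbations that leak in from similar regions (assumed
    to operate at patch granularity)."""
    pur = split_patches(purified, p)
    org = split_patches(original, p)
    take_orig = rng.random(pur.shape[:2]) < mix_prob
    mixed = np.where(take_orig[:, :, None, None, None], org, pur)
    return merge_patches(mixed)

def combined_loss(recon, target, feat_fn, lam=1.0):
    """Pixel loss plus a weighted perceptual loss; feat_fn is a hypothetical
    feature extractor, and both terms are assumed to be mean squared errors."""
    pixel = np.mean((recon - target) ** 2)
    perceptual = np.mean((feat_fn(recon) - feat_fn(target)) ** 2)
    return pixel + lam * perceptual
```

In a full pipeline, a learned reconstruction model would sit between `mask_patches` and `random_mix`; the sketch leaves it out because the abstract does not specify its architecture.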

AID-Purifier: A Light Auxiliary Network for Boosting Adversarial Defense

Adversarial Training on Purification (AToP): Advancing Both Robustness and Generalization

Purify++: Improving Diffusion-Purification with Advanced Diffusion Models and Control of Randomness

Adversarial Purification of Information Masking

Classifier Guidance Enhances Diffusion-based Adversarial Purification by Preserving Predictive Information

NCIS: Neural Contextual Iterative Smoothing for Purifying Adversarial Perturbations

ZeroPur: Succinct Training-Free Adversarial Purification

Guided Diffusion Model for Adversarial Purification

ADBM: Adversarial diffusion bridge model for reliable adversarial purification

A Universal Defense Strategy Against Adversarial Attacks Based on Attention-Guided

Randomized Purifier Based on Low Adversarial Transferability for Adversarial Defense

Robust Evaluation of Diffusion-Based Adversarial Purification

Language Guided Adversarial Purification

Purifier: Defending Data Inference Attacks via Transforming Confidence Scores

PuVAE: A Variational Autoencoder to Purify Adversarial Examples

PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning

Text Adversarial Purification As Defense Against Adversarial Attacks

LightPure: Realtime Adversarial Image Purification for Mobile Devices Using Diffusion Models

Instant Adversarial Purification with Adversarial Consistency Distillation

LoRID: Low-Rank Iterative Diffusion for Adversarial Purification

Iterative Window Mean Filter: Thwarting Diffusion-based Adversarial Purification