Abstract: Diffusion models (DMs) have demonstrated great potential in the field of adversarial robustness, where DM-based defense methods can achieve superior defense capability without adversarial training. However, they all require huge computational costs due to the usage of large-scale pre-trained DMs, making it difficult to conduct full evaluation under strong attacks and compare with traditional CNN-based methods. Simply reducing the network size and timesteps in DMs could significantly harm the image generation quality, which invalidates previous frameworks. To alleviate this issue, we redesign the diffusion framework from generating high-quality images to predicting distinguishable image labels. Specifically, we employ an image translation framework to learn many-to-one mapping from input samples to designed orthogonal image labels. Based on this framework, we introduce an efficient Image-to-Image diffusion classifier with a pruned U-Net structure and reduced diffusion timesteps. Besides the framework, we redesign the optimization objective of DMs to fit the target of image classification, where a new classification loss is incorporated in the DM-based image translation framework to distinguish the generated label from those of other classes. We conduct sufficient evaluations of the proposed classifier under various attacks on popular benchmarks. Extensive experiments show that our method achieves better adversarial robustness with fewer computational costs than DM-based and CNN-based methods. The code is available at https://github.com/hfmei/IDC.
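The abstract does not spell out the form of the classification loss, so the following is only a hypothetical sketch of what such an objective could look like: one term pulls a generated label toward the label of its own class, and a hinge term pushes it away from the labels of other classes. The function name `classification_loss`, the `margin` parameter, and the use of one-hot rows as stand-in orthogonal labels are all assumptions made for illustration and are not taken from the paper or its code.

```python
import numpy as np


def classification_loss(generated, labels, targets, margin=1.0):
    """Hypothetical pull/push loss on flattened label vectors (not the paper's exact loss).

    generated: (batch, dim) label vectors produced by the model
    labels:    (classes, dim) predefined class labels
    targets:   (batch,) integer class ids
    """
    # Squared L2 distance from every prediction to every class label: (batch, classes)
    dists = ((generated[:, None, :] - labels[None, :, :]) ** 2).sum(axis=-1)
    rows = np.arange(generated.shape[0])

    # Pull term: distance to the label of the correct class.
    pull = dists[rows, targets]

    # Push term: hinge that is active only for wrong labels closer than `margin`.
    hinge = np.maximum(0.0, margin - dists)
    hinge[rows, targets] = 0.0
    push = hinge.sum(axis=1) / (labels.shape[0] - 1)

    return float((pull + push).mean())


# Stand-in orthogonal labels (assumption): rows of the identity matrix.
labels = np.eye(4)
targets = np.array([0, 2])

good_loss = classification_loss(labels[[0, 2]], labels, targets)  # predictions match targets
bad_loss = classification_loss(labels[[1, 3]], labels, targets)   # predictions hit wrong labels
```

In this toy setup, predictions that coincide with their own class label incur zero loss, while predictions that land on another class's label are penalized by both terms.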

What problem does this paper attempt to address?

The paper attempts to address the problem of improving the adversarial robustness of image classification models while maintaining efficiency. Specifically, existing methods based on diffusion models (DMs) perform well under adversarial attacks but are computationally expensive, making comprehensive evaluation and comparison with traditional convolutional neural network (CNN) methods difficult. The paper proposes a novel Image-to-Image Diffusion Classifier (IDC) that significantly reduces computational complexity and achieves better adversarial robustness by redesigning the diffusion framework to transform the high-quality image generation task into an image-label alignment task.

### Main Contributions:

1. **Proposed a novel Image-to-Image Diffusion Classifier (IDC)**: Transformed the image generation task into a classification task through predefined orthogonal image labels.
2. **Reduced the complexity of diffusion models**: Included pruning of network structures and reduction of diffusion steps, thereby lowering computational costs without compromising performance.
3. **Introduced classification loss**: Added classification loss during optimization to better adapt to the classification task.
4. **Extensive experimental validation**: Conducted numerous experiments on multiple benchmark datasets, demonstrating IDC's superior balance between adversarial robustness and model efficiency.

### Method Overview:

- **Orthogonal Image Label Generation**: Generated orthogonal image labels using QR decomposition for the classification task.
- **Image Label Translation**: Employed an image translation framework to translate input images into predefined image labels.
- **Image-to-Image Classification**: Achieved classification by calculating the distance between generated image labels and predefined labels.
- **Reduced Diffusion Complexity**: Significantly lowered computational costs by pruning the U-Net structure and reducing diffusion steps.
- **Classification Optimization**: Introduced intra-class and inter-class losses to optimize the performance of the diffusion classifier.

### Experimental Results:

- **CIFAR-10 Dataset**: IDC demonstrated significantly better robustness under various adversarial attacks compared to traditional CNN methods, achieving a good balance between standard accuracy and adversarial accuracy.
- **CIFAR-100 Dataset**: IDC achieved comparable standard accuracy to traditional methods but significantly improved robustness under adversarial attacks.
- **Adversarial Adaptive Attacks**: Under adaptive attacks such as BPDA+EOT and PGD+EOT, IDC outperformed existing DM and CNN methods with significantly reduced parameter counts.

### Conclusion:

The paper proposes an efficient Image-to-Image Diffusion Classifier (IDC) that significantly reduces computational complexity and achieves better adversarial robustness by redesigning the diffusion framework. Experimental results show that IDC performs excellently across various datasets and adversarial attacks, indicating broad application prospects.
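Two steps from the method overview, building orthogonal image labels with QR decomposition and classifying by distance to those labels, can be illustrated with a minimal NumPy sketch. It assumes that the labels come from the QR factorization of a random Gaussian matrix and that the distance is squared L2; the summary states neither detail, and the function names, the 8x8x3 label size, and the noisy stand-in for the diffusion model's output are all hypothetical.

```python
import numpy as np


def make_orthogonal_labels(num_classes, height, width, channels=3, seed=0):
    """Return one image-shaped label per class; the flattened labels are orthonormal."""
    dim = height * width * channels
    if num_classes > dim:
        raise ValueError("need num_classes <= height * width * channels")
    rng = np.random.default_rng(seed)
    # Reduced QR of a (dim, num_classes) Gaussian matrix: the columns of q
    # are orthonormal, so each column can serve as one class label.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    return q.T.reshape(num_classes, height, width, channels)


def classify(generated, labels):
    """Assign each generated image to the class whose label is nearest (squared L2)."""
    flat_gen = generated.reshape(generated.shape[0], -1)  # (batch, dim)
    flat_lab = labels.reshape(labels.shape[0], -1)        # (classes, dim)
    # Pairwise squared distances via broadcasting: (batch, classes)
    dists = ((flat_gen[:, None, :] - flat_lab[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


labels = make_orthogonal_labels(num_classes=10, height=8, width=8)
flat = labels.reshape(10, -1)
gram = flat @ flat.T  # approximately the identity matrix for orthonormal labels

# Stand-in for the diffusion model's output (assumption): true labels plus small noise.
rng = np.random.default_rng(1)
true_classes = np.array([3, 7, 0])
generated = labels[true_classes] + 0.01 * rng.standard_normal((3, 8, 8, 3))
predicted = classify(generated, labels)
```

Because distinct orthonormal labels sit at a fixed distance of sqrt(2) from one another, a generated label that lands anywhere near its target is still assigned to the right class. This is one intuition for why label prediction can tolerate lower generation quality than image synthesis.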

Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness

Robust Classification via a Single Diffusion Model

Struggle with Adversarial Defense? Try Diffusion

Robust Diffusion Models for Adversarial Purification

Lightweight Robust Image Classifier Using Non-Overlapping Image Compression Filters

An Efficient Framework for Enhancing Discriminative Models via Diffusion Techniques

Robust CLIP-Based Detector for Exposing Diffusion Model-Generated Images

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

Mitigating Adversarial Attacks in Object Detection through Conditional Diffusion Models

TrojDiff: Trojan Attacks on Diffusion Models with Diverse Targets

Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance

A Diffusion-Based Framework for Multi-Class Anomaly Detection

Pruning then Reweighting: Towards Data-Efficient Training of Diffusion Models

DiffDefense: Defending against Adversarial Attacks via Diffusion Models

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

DiffI2I: Efficient Diffusion Model for Image-to-Image Translation

Adversarial Robustification via Text-to-Image Diffusion Models

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

D3R-Net: Denoising Diffusion-Based Defense Restore Network for Adversarial Defense in Remote Sensing Scene Classification

Unveiling Universal Forensics of Diffusion Models with Adversarial Perturbations