Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness

Hefei Mei, Minjing Dong, Chang Xu
2024-08-16
Abstract: Diffusion models (DMs) have demonstrated great potential in the field of adversarial robustness, where DM-based defense methods can achieve superior defense capability without adversarial training. However, they all require huge computational costs due to the usage of large-scale pre-trained DMs, making it difficult to conduct full evaluation under strong attacks and compare with traditional CNN-based methods. Simply reducing the network size and timesteps in DMs could significantly harm the image generation quality, which invalidates previous frameworks. To alleviate this issue, we redesign the diffusion framework from generating high-quality images to predicting distinguishable image labels. Specifically, we employ an image translation framework to learn many-to-one mapping from input samples to designed orthogonal image labels. Based on this framework, we introduce an efficient Image-to-Image diffusion classifier with a pruned U-Net structure and reduced diffusion timesteps. Besides the framework, we redesign the optimization objective of DMs to fit the target of image classification, where a new classification loss is incorporated in the DM-based image translation framework to distinguish the generated label from those of other classes. We conduct sufficient evaluations of the proposed classifier under various attacks on popular benchmarks. Extensive experiments show that our method achieves better adversarial robustness with fewer computational costs than DM-based and CNN-based methods. The code is available at https://github.com/hfmei/IDC.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper attempts to address the problem of improving the adversarial robustness of image classification models while maintaining efficiency. Specifically, existing methods based on diffusion models (DMs) perform well under adversarial attacks but are computationally expensive, making comprehensive evaluation and comparison with traditional convolutional neural network (CNN) methods difficult. The paper proposes a novel Image-to-Image Diffusion Classifier (IDC) that redesigns the diffusion framework, replacing the high-quality image generation task with an image-label alignment task. This significantly reduces computational complexity while achieving better adversarial robustness.

### Main Contributions:

1. **Proposed a novel Image-to-Image Diffusion Classifier (IDC)**: Transformed the image generation task into a classification task through predefined orthogonal image labels.
2. **Reduced the complexity of diffusion models**: Pruned the network structure and reduced the number of diffusion steps, lowering computational cost without compromising performance.
3. **Introduced a classification loss**: Added a classification loss to the optimization objective so that the model better fits the classification task.
4. **Extensive experimental validation**: Conducted experiments on multiple benchmark datasets, demonstrating IDC's superior balance between adversarial robustness and model efficiency.

### Method Overview:

- **Orthogonal Image Label Generation**: Generated orthogonal image labels using QR decomposition for the classification task.
- **Image Label Translation**: Employed an image translation framework to translate input images into predefined image labels.
- **Image-to-Image Classification**: Achieved classification by calculating the distance between generated image labels and predefined labels.
- **Reduced Diffusion Complexity**: Significantly lowered computational costs by pruning the U-Net structure and reducing diffusion steps.
- **Classification Optimization**: Introduced intra-class and inter-class losses to optimize the performance of the diffusion classifier.

### Experimental Results:

- **CIFAR-10 Dataset**: IDC demonstrated significantly better robustness under various adversarial attacks compared to traditional CNN methods, achieving a good balance between standard accuracy and adversarial accuracy.
- **CIFAR-100 Dataset**: IDC achieved standard accuracy comparable to traditional methods while significantly improving robustness under adversarial attacks.
- **Adaptive Attacks**: Under adaptive attacks such as BPDA+EOT and PGD+EOT, IDC outperformed existing DM-based and CNN-based methods with significantly fewer parameters.

### Conclusion:

The paper proposes an efficient Image-to-Image Diffusion Classifier (IDC) that significantly reduces computational complexity and achieves better adversarial robustness by redesigning the diffusion framework. Experimental results show that IDC performs well across the evaluated datasets and adversarial attacks.
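
To make the Method Overview more concrete, the following is a minimal NumPy sketch of two of its steps: building orthogonal image labels with QR decomposition, and classifying by distance to those labels. It is not taken from the authors' code. The function names, the 3x32x32 label shape, the use of a random Gaussian matrix as input to QR, and the choice of L2 distance are all assumptions made for illustration. The diffusion model itself is not included, so the "generated" labels here are simply the clean labels plus a small amount of noise.

```python
import numpy as np


def make_orthogonal_labels(num_classes, shape=(3, 32, 32), seed=0):
    """Return one image-shaped label per class with mutually orthogonal pixels.

    Hypothetical helper: the summary only says QR decomposition is used;
    factorizing a random Gaussian matrix is an assumption of this sketch.
    """
    dim = int(np.prod(shape))
    rng = np.random.default_rng(seed)
    # Reduced QR of a (dim x num_classes) matrix yields orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    return q.T.reshape(num_classes, *shape)


def classify_by_distance(generated, labels):
    """Assign each generated label image to the nearest predefined label.

    L2 distance is assumed here; the summary does not name the metric.
    """
    flat_gen = generated.reshape(generated.shape[0], -1)
    flat_lab = labels.reshape(labels.shape[0], -1)
    # Pairwise distances, shape (num_generated, num_classes).
    dists = np.linalg.norm(flat_gen[:, None, :] - flat_lab[None, :, :], axis=-1)
    return dists.argmin(axis=1)


if __name__ == "__main__":
    labels = make_orthogonal_labels(num_classes=10)
    # Stand-in for the diffusion model output: clean labels plus small noise.
    rng = np.random.default_rng(1)
    generated = labels + 0.01 * rng.standard_normal(labels.shape)
    print(classify_by_distance(generated, labels))
```

Because the flattened labels are orthonormal, any two distinct labels are the same distance apart (the square root of 2 in this sketch), so no class is closer to one neighbour than to another. This is one plausible reason to prefer orthogonal targets over arbitrary ones.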