Transferable Learned Image Compression-Resistant Adversarial Perturbations

Yang Sui, Zhuohang Li, Ding Ding, Xiang Pan, Xiaozhong Xu, Shan Liu, Zhenzhong Chen
2024-01-06
Abstract: Adversarial attacks can readily disrupt the image classification system, revealing the vulnerability of DNN-based recognition tasks. While existing adversarial perturbations are primarily applied to uncompressed images or compressed images by the traditional image compression method, i.e., JPEG, limited studies have investigated the robustness of models for image classification in the context of DNN-based image compression. With the rapid evolution of advanced image compression, DNN-based learned image compression has emerged as the promising approach for transmitting images in many security-critical applications, such as cloud-based face recognition and autonomous driving, due to its superior performance over traditional compression. Therefore, there is a pressing need to fully investigate the robustness of a classification system post-processed by learned image compression. To bridge this research gap, we explore the adversarial attack on a new pipeline that targets image classification models that utilize learned image compressors as pre-processing modules. Furthermore, to enhance the transferability of perturbations across various quality levels and architectures of learned image compression models, we introduce a saliency score-based sampling method to enable the fast generation of transferable perturbation. Extensive experiments with popular attack methods demonstrate the enhanced transferability of our proposed method when attacking images that have been post-processed with different learned image compression models.
Computer Vision and Pattern Recognition, Multimedia, Image and Video Processing
What problem does this paper attempt to address?
The paper mainly addresses the following issues:

1. **Research Background and Objectives**: With the widespread use of deep neural networks (DNNs) in tasks such as image classification, adversarial attacks have become a significant threat. These attacks add small but carefully designed perturbations to the input so that the model makes incorrect predictions. Existing adversarial attacks mainly target uncompressed images or images compressed with traditional methods such as JPEG; there is little research on the robustness of image classification systems that use Learned Image Compression (LIC).
2. **Specific Problem**: The paper studies the robustness of LIC-based image classification systems (LICCS) against adversarial perturbations and evaluates how well these perturbations transfer, especially between LIC models of different quality levels and architectures.
3. **Solutions**:
   - **Adversarial Attack Pipeline**: An adversarial attack pipeline targeting LICCS is proposed, in which the LIC model acts as a pre-processing module for the image classification model.
   - **Robustness Evaluation**: White-box attack experiments were conducted first to evaluate the robustness of LICCS; the results showed that LICCS is relatively vulnerable to such attacks.
   - **Transferability Exploration**: The transferability of perturbations in black-box attack scenarios was then explored, revealing that models at adjacent quality levels are more susceptible to transferred perturbations.
   - **Improvement Scheme**: To enhance the transferability of perturbations, a saliency score-based sampling method was introduced. It selects the most influential combinations of LIC quality levels for a joint attack, so that perturbations transfer effectively across different quality levels and architectures.
4. **Contribution Summary**:
   - Investigated the adversarial attack pipeline against LICCS, which, to the best of the authors' knowledge, is the first robustness study of such systems.
   - Evaluated, through a series of experiments, the robustness of LICCS in both white-box and black-box attack scenarios and the transferability of perturbations.
   - Proposed a saliency score-based sampling method that generates perturbations with better transferability and works effectively even under limited model access.

In summary, the paper examines the robustness of LIC-based image classification systems against adversarial attacks and the transferability of the resulting perturbations, through analysis and experimental validation, and proposes a sampling method that improves that transferability.
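
The idea of choosing which LIC quality levels to attack jointly can be illustrated with a small sketch. This is not the authors' algorithm: the summary does not define the saliency score, so the score below (average best-case transfer over all target levels) is an assumption, the transfer matrix holds made-up numbers rather than results from the paper, and the names `TRANSFER`, `saliency_score`, and `select_levels` are hypothetical. The toy matrix only mirrors the qualitative observation that perturbations transfer best between adjacent quality levels.

```python
from itertools import combinations

# Toy transfer matrix with made-up values (NOT results from the paper).
# TRANSFER[i][j] = assumed success rate of a perturbation crafted against
# LIC quality level i when evaluated against quality level j.
# Values decay with |i - j| to mimic stronger transfer between adjacent levels.
TRANSFER = [
    [0.95, 0.70, 0.40, 0.20],
    [0.65, 0.95, 0.70, 0.35],
    [0.35, 0.70, 0.95, 0.65],
    [0.15, 0.40, 0.70, 0.95],
]


def saliency_score(subset, transfer):
    """Hypothetical score for a set of source levels.

    For each target level, take the best success rate achieved by any
    selected source level, then average over all target levels.
    """
    n_targets = len(transfer[0])
    best_per_target = (
        max(transfer[i][j] for i in subset) for j in range(n_targets)
    )
    return sum(best_per_target) / n_targets


def select_levels(transfer, k):
    """Return the k-level combination with the highest hypothetical score.

    Exhaustive search is used here because the toy problem is tiny; it is a
    stand-in for, not a reproduction of, the paper's sampling procedure.
    """
    candidates = combinations(range(len(transfer)), k)
    return max(candidates, key=lambda s: saliency_score(s, transfer))


if __name__ == "__main__":
    chosen = select_levels(TRANSFER, k=2)
    print("levels selected for the joint attack:", chosen)
    print("score:", round(saliency_score(chosen, TRANSFER), 4))
```

With this toy matrix, the search prefers the two extreme quality levels, because together they cover every target level within one step. A joint attack would then optimize a single perturbation against the selected levels only, rather than against all of them.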