Hardening DNNs against Transfer Attacks during Network Compression using Greedy Adversarial Pruning

Jonah O'Brien Weiss, Tiago Alves, Sandip Kundu
DOI: https://doi.org/10.48550/arXiv.2206.07406
2022-06-15
Abstract: The prevalence and success of Deep Neural Network (DNN) applications in recent years have motivated research on DNN compression, such as pruning and quantization. These techniques accelerate model inference, reduce power consumption, and reduce the size and complexity of the hardware necessary to run DNNs, all with little to no loss in accuracy. However, since DNNs are vulnerable to adversarial inputs, it is important to consider the relationship between compression and adversarial robustness. In this work, we investigate the adversarial robustness of models produced by several irregular pruning schemes and by 8-bit quantization. Additionally, while conventional pruning removes the least important parameters in a DNN, we investigate the effect of an unconventional pruning method: removing the most important model parameters based on the gradient on adversarial inputs. We call this method Greedy Adversarial Pruning (GAP) and we find that this pruning method results in models that are resistant to transfer attacks from their uncompressed counterparts.
Machine Learning, Cryptography and Security
What problem does this paper attempt to address?
The core problem this paper addresses is: **how to harden deep neural networks (DNNs) against transfer attacks during network compression**. Specifically, the paper focuses on two aspects:

1. **The relationship between DNN compression and adversarial robustness**:
   - Compression techniques such as pruning and quantization reduce model size and complexity, speed up inference, and lower power consumption. However, they may also change the model's decision boundaries and thereby affect its adversarial robustness.
   - Understanding how different compression methods (pruning in particular) affect adversarial robustness is necessary to ensure that a compressed model both maintains high accuracy and resists adversarial samples.
2. **A new pruning method, Greedy Adversarial Pruning (GAP)**:
   - Conventional pruning removes the least important parameters in the network. GAP does the opposite: it removes the parameters with the largest loss gradients on adversarial samples. In this way, GAP aims to remove the DNN's tendency to misclassify known adversarial samples.
   - The paper tests whether GAP improves robustness against adversarial samples generated from the uncompressed model, that is, robustness against transfer attacks.

### Main contributions

- **Experimental analysis**: The paper analyzes the impact of several irregular pruning schemes and of 8-bit quantization on the adversarial robustness of DNNs.
- **Introduction of GAP**: The paper proposes Greedy Adversarial Pruning (GAP) and evaluates its performance under different compression ratios.
- **Result discovery**: Models produced by GAP remain vulnerable to adversarial samples crafted against the GAP models themselves, but they are highly robust to adversarial samples generated from the uncompressed model, which indicates that GAP changes the model's decision boundaries.

### Formula summary

- Pruning mask generation formula:

  \[ M_n = \begin{cases} 0, & s(\theta_n) < \gamma \\ 1, & \text{otherwise} \end{cases} \]

  where \( s(\cdot) \) is the parameter importance scoring function and \( \gamma \) is the importance threshold, set at a chosen percentile of the scores.
- Gradient pruning scoring function:

  \[ s(\theta_n) = \sum_{(x, y) \in \mathcal{D}} |\nabla_{\theta_n} L(\theta, x, y)| \]

- GAP scoring function:

  \[ s(\theta_n) = \sum_{(x, y) \in \mathcal{D}} -\nabla_{\theta_n} L(\theta, x_{\text{adv}}, y) \]

  where \( x_{\text{adv}} \) is the adversarial sample generated from the unperturbed input \( x \). Because of the negative sign, the parameters with the largest summed adversarial gradients receive the lowest scores and are therefore the ones the mask removes.

Through these studies, the paper provides new perspectives and methods for understanding and enhancing the adversarial robustness of compressed DNNs.
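The three formulas above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy logistic model \( p = \sigma(\theta \cdot x) \) with binary cross-entropy loss so that gradients have a closed form, and it assumes a one-step FGSM attack to produce \( x_{\text{adv}} \), since the summary does not name the attack. All function names (`loss_grads`, `fgsm`, `gradient_scores`, `gap_scores`, `prune_mask`) and the `fraction` and `eps` parameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def loss_grads(theta, x, y):
    """Gradients of binary cross-entropy for the toy model p = sigmoid(theta . x).

    Returns (dL/dtheta, dL/dx). The logistic model is an assumption made
    only so the gradients can be written in closed form.
    """
    err = sigmoid(sum(t * xi for t, xi in zip(theta, x))) - y
    return [err * xi for xi in x], [err * t for t in theta]


def fgsm(theta, x, y, eps):
    """One-step FGSM perturbation (assumed attack; the summary names none)."""
    _, grad_x = loss_grads(theta, x, y)
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad_x)]


def gradient_scores(theta, data):
    """Gradient pruning score: s(theta_n) = sum over D of |dL/dtheta_n|."""
    scores = [0.0] * len(theta)
    for x, y in data:
        grad_theta, _ = loss_grads(theta, x, y)
        for n, g in enumerate(grad_theta):
            scores[n] += abs(g)
    return scores


def gap_scores(theta, data, eps):
    """GAP score: s(theta_n) = sum over D of -dL/dtheta_n evaluated at x_adv."""
    scores = [0.0] * len(theta)
    for x, y in data:
        x_adv = fgsm(theta, x, y, eps)
        grad_theta, _ = loss_grads(theta, x_adv, y)
        for n, g in enumerate(grad_theta):
            scores[n] -= g
    return scores


def prune_mask(scores, fraction):
    """M_n = 0 if s(theta_n) < gamma else 1.

    gamma is taken as the score at the `fraction` quantile, so roughly that
    share of parameters (fewer if scores tie) is masked out.
    """
    k = int(fraction * len(scores))
    if k <= 0:
        return [1] * len(scores)
    if k >= len(scores):
        return [0] * len(scores)
    gamma = sorted(scores)[k]
    return [0 if s < gamma else 1 for s in scores]
```

For example, `prune_mask([0.1, 0.5, 0.3, 0.9], 0.5)` returns `[0, 1, 0, 1]`. The same `prune_mask` serves both scoring functions; only the scores differ. With `gap_scores`, the parameter whose summed adversarial gradient is largest gets the lowest score and is masked first, which is the "remove the most important parameters" behavior described above. In a real network the mask would be multiplied elementwise with the weights.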