Adversarial Pruning: A Survey and Benchmark of Pruning Methods for Adversarial Robustness

Giorgio Piras, Maura Pintor, Ambra Demontis, Battista Biggio, Giorgio Giacinto, Fabio Roli
2024-09-02
Abstract: Recent work has proposed neural network pruning techniques to reduce the size of a network while preserving robustness against adversarial examples, i.e., well-crafted inputs inducing a misclassification. These methods, which we refer to as adversarial pruning methods, involve complex and articulated designs, making it difficult to analyze the differences and establish a fair and accurate comparison. In this work, we overcome these issues by surveying current adversarial pruning methods and proposing a novel taxonomy to categorize them based on two main dimensions: the pipeline, defining when to prune; and the specifics, defining how to prune. We then highlight the limitations of current empirical analyses and propose a novel, fair evaluation benchmark to address them. We finally conduct an empirical re-evaluation of current adversarial pruning methods and discuss the results, highlighting the shared traits of top-performing adversarial pruning methods, as well as common issues. We welcome contributions in our publicly-available benchmark at https://github.com/pralab/AdversarialPruningBenchmark
Machine Learning, Computer Vision and Pattern Recognition
### What problem does this paper attempt to solve?

This paper addresses the problem of reducing model size through neural network pruning while maintaining adversarial robustness. Specifically:

1. **Challenges of adversarial robustness**: Deep neural networks are vulnerable to adversarial examples: small perturbations added to the input data can cause the model to misclassify. In safety-critical scenarios (such as self-driving cars and modern cybersecurity tasks), a model must therefore be not only compressible but also robust against these attacks.
2. **Limitations of existing methods**: Existing Adversarial Pruning (AP) methods are complex and diverse in design, which makes a fair and accurate comparison difficult. They differ in both pruning pipeline and pruning specifics, so the methods in the literature are hard to describe and evaluate systematically.
3. **Lack of a unified evaluation benchmark**: Current AP methods use inconsistent experimental settings and evaluation criteria. Their results therefore cannot be compared directly, and it is unclear which method performs best in practical applications.

To solve these problems, the authors made the following contributions:

- **Propose a new classification framework (taxonomy)**: Classify existing AP methods along two main dimensions, the pruning pipeline (when to prune) and the pruning specifics (how to prune). This helps to better understand the design characteristics of each method.
- **Establish a unified evaluation benchmark**: Propose a new, fair evaluation benchmark so that different AP methods are tested under the same experimental settings, yielding reliable comparison results.

Through these contributions, the authors hope to steer research on AP methods in a more systematic and comparable direction, providing a clear "blueprint" for future research.

### Formula summary

- **Mathematical expression of the pruning problem**:
  \[ m^* = \arg\min_{\|m\|_0 \leq k} L(\theta \otimes m, x, y) \]
  where \(\theta\) denotes the model parameters, \(L\) is the loss function, \((x,y)\sim D\) is a sample drawn from the training data distribution \(D\), \(k\) is the upper bound on the number of retained (non-zero) parameters, \(m\) is a binary mask indicating whether each parameter is retained, and \(\otimes\) denotes element-wise multiplication of the parameters by the mask.
- **Definition of sparsity rate**:
  \[ sr = 1 - \frac{k}{|\theta|} \]
  where \(sr\) is the sparsity rate, \(k\) is the number of retained parameters, and \(|\theta|\) is the total number of original parameters.
- **Optimization problem in adversarial training**:
  \[ \min_{\theta} \mathbb{E}_{(x,y)\sim D}\left[\max_{\|\delta\|_p \leq \epsilon} L(\theta, x + \delta, y)\right] \]
  where \(\delta\) is the adversarial perturbation, \(\epsilon\) is the upper bound on the \(\ell_p\) norm of the perturbation, and \((x,y)\sim D\) is a sample drawn from the training data distribution \(D\).

These formulas formalize the pruning and adversarial training problems on which adversarial pruning methods build.
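
To make the three formulas concrete, the following is a minimal NumPy sketch under stated assumptions, not an implementation of any method surveyed in the paper. It takes \(k\) as the number of retained parameters and builds the mask \(m\) with plain magnitude pruning, a simple heuristic swapped in here instead of solving the arg-min. The inner maximization is shown only for a linear model with logistic loss and an \(\ell_\infty\) budget, where the worst-case perturbation has the closed form \(\delta^* = -\epsilon\, y\, \mathrm{sign}(w)\). All function names (`magnitude_mask`, `sparsity_rate`, `logistic_loss`, `worst_case_linf`) and the toy values are hypothetical.

```python
import numpy as np


def magnitude_mask(theta, k):
    """Binary mask keeping the k largest-magnitude entries of theta (||m||_0 = k).

    Assumption: magnitude pruning is used as an illustrative heuristic only.
    """
    if not 0 <= k <= theta.size:
        raise ValueError("k must be between 0 and the number of parameters")
    mask = np.zeros(theta.size, dtype=np.int8)
    if k > 0:
        # Indices of the k entries with the largest absolute value.
        keep = np.argsort(np.abs(theta).ravel(), kind="stable")[-k:]
        mask[keep] = 1
    return mask.reshape(theta.shape)


def sparsity_rate(mask):
    """sr = 1 - k / |theta|, with k the number of retained parameters."""
    return 1.0 - np.count_nonzero(mask) / mask.size


def logistic_loss(w, b, x, y):
    """Logistic loss log(1 + exp(-y * (w.x + b))) for a label y in {-1, +1}."""
    return float(np.logaddexp(0.0, -y * (np.dot(w, x) + b)))


def worst_case_linf(w, y, eps):
    """Maximizer of the logistic loss over ||delta||_inf <= eps (linear model only)."""
    return -eps * y * np.sign(w)


# Toy parameter vector (hypothetical values); retain k = 3 of 10 weights.
theta = np.array([0.5, -2.0, 0.1, 1.5, -0.3, 0.05, 3.0, -0.7, 0.2, -1.0])
mask = magnitude_mask(theta, k=3)
sr = sparsity_rate(mask)
pruned = theta * mask  # element-wise product of parameters and mask

# Inner maximization of the adversarial training objective for the pruned model.
x, y, eps = np.ones_like(theta), 1, 0.1
delta = worst_case_linf(pruned, y, eps)
clean_loss = logistic_loss(pruned, 0.0, x, y)
robust_loss = logistic_loss(pruned, 0.0, x + delta, y)

print("mask:", mask)
print("sparsity rate:", sr)
print("clean loss:", clean_loss, "worst-case loss:", robust_loss)
```

For deep networks the inner maximization has no closed form and is usually approximated with an iterative attack; the linear case is used here only because its worst-case perturbation can be written down exactly.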