Learning effective pruning at initialization from iterative pruning

Shengkai Liu, Yaofeng Cheng, Fusheng Zha, Wei Guo, Lining Sun, Zhenshan Bing, Chenguang Yang
2024-08-27
Abstract: Pruning at initialization (PaI) reduces training costs by removing weights before training, which becomes increasingly important as networks grow larger. However, current PaI methods still show a large accuracy gap compared with iterative pruning, especially at high sparsity levels. This raises an intriguing question: can we draw inspiration from iterative pruning to improve PaI performance? In the lottery ticket hypothesis, iterative rewind pruning (IRP) finds subnetworks retroactively by rewinding the parameters to the original initialization in every pruning iteration, which means all the subnetworks are based on the initial state. Here, we hypothesise that the surviving subnetworks are more important, and we use the link between the initial features and their surviving score as the PaI criterion. We employ an end-to-end neural network, AutoSparse (AutoS), to learn this correlation: it takes the model's initial features as input, outputs their scores, and the lowest-scoring parameters are then pruned before training. To validate the accuracy and generalization of our method, we performed PaI across various models. Results show that our approach outperforms existing methods in high-sparsity settings. Notably, because the underlying logic of model pruning is consistent across models, IRP needs to be run only once on one model (e.g., after running IRP once on ResNet-18/CIFAR-10, AutoS can be generalized to VGG-16/CIFAR-10, ResNet-18/TinyImageNet, etc.). As this is the first neural network-based PaI method, we conduct extensive experiments to examine the factors influencing this approach. These results reveal the learning tendencies of neural networks and provide new insights into our understanding and research of PaI from a practical perspective. Our code is available at: https://github.com/ChengYaofeng/AutoSparse.git
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses the accuracy gap between pruning at initialization (PaI) and iterative pruning. Current PaI methods fall well short of iterative pruning at high sparsity. The authors propose a new method that improves PaI by drawing on iterative pruning.

#### Background and problem description

1. **Training cost and resource consumption**:
   - As neural networks grow, the computational resources and time required to train them increase significantly. Pruning before training reduces this cost.
2. **The gap between PaI and iterative pruning**:
   - Current PaI methods usually rely on hand-designed criteria to evaluate the importance of parameters and remove unimportant weights before training. These methods perform far worse than iterative pruning at high sparsity.
3. **Inspiration from the lottery ticket hypothesis (LTH)**:
   - In the LTH, iterative rewind pruning (IRP) rewinds the parameters to their original initialization in every pruning iteration, so every subnetwork it finds is based on the initial state. The authors hypothesise that the surviving subnetworks are more important, so a parameter's survival can serve as the criterion for PaI.

#### Proposed method

To narrow the performance gap between PaI and iterative pruning, the authors propose an end-to-end neural network framework named AutoS (AutoSparse). The framework works in three steps:

1. **Dataset generation**:
   - Run IRP once to generate a dataset containing initial parameters and gradients.
2. **Feature input and prediction**:
   - AutoS takes the model's initial features (such as initial parameters and initial gradients) as input and outputs an importance score for each parameter.
3. **Pruning operation**:
   - Given the target sparsity level, prune the parameters with the lowest scores.

#### Main contributions

1. **A new PaI criterion**:
   - Obtain importance scores for parameters through iterative rewind pruning and use them for PaI.
2. **The AutoS framework**:
   - Use an end-to-end neural network to learn the PaI criterion automatically, replacing traditional hand-designed criteria.
3. **Extensive experimental verification**:
   - The experiments show that AutoS outperforms existing methods at high sparsity and can be applied to different models after running IRP only once.

With this method, the authors not only improve the accuracy of PaI but also reveal the learning tendencies of neural networks, providing a new perspective for future PaI research.
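
The pipeline described under the proposed method can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation, and it rests on several assumptions: IRP is shown with magnitude-based pruning, a parameter's survival score is taken to be the number of pruning rounds it survives, and `toy_train` is a hypothetical stand-in for real training. The AutoS network itself is omitted, so `prune_at_init` simply receives scores that such a predictor would produce. All function names are illustrative.

```python
def prune_lowest(scores, mask, fraction):
    """Return a new mask with the lowest-scoring `fraction` of the
    currently surviving parameters removed."""
    alive = [i for i, m in enumerate(mask) if m]
    n_remove = int(len(alive) * fraction)
    # Surviving indices sorted by ascending score; drop the first n_remove.
    to_remove = sorted(alive, key=lambda i: scores[i])[:n_remove]
    new_mask = list(mask)
    for i in to_remove:
        new_mask[i] = 0
    return new_mask


def irp_survival_scores(init_weights, train_fn, rounds, fraction):
    """Iterative rewind pruning (sketch): train, prune, rewind, repeat.

    Returns (survival, mask). survival[i] counts the pruning rounds that
    parameter i survived (an assumed definition of the survival score).
    """
    mask = [1] * len(init_weights)
    survival = [0] * len(init_weights)
    for _ in range(rounds):
        # Rewind: every round restarts from the original initialization.
        start = [w * m for w, m in zip(init_weights, mask)]
        trained = train_fn(start, mask)
        # Assumption: magnitude of the trained weight decides what is pruned.
        mask = prune_lowest([abs(w) for w in trained], mask, fraction)
        survival = [s + m for s, m in zip(survival, mask)]
    return survival, mask


def prune_at_init(scores, sparsity):
    """PaI step: remove the `sparsity` fraction of parameters with the
    lowest predicted scores, before any training happens."""
    return prune_lowest(scores, [1] * len(scores), sparsity)


def toy_train(weights, mask):
    """Hypothetical stand-in for training: returns the weights unchanged."""
    return list(weights)


init = [0.5, -0.1, 0.3, -0.8, 0.05, 0.2, -0.4, 0.9]
survival, final_mask = irp_survival_scores(init, toy_train, rounds=3, fraction=0.25)
print(survival)    # [3, 0, 2, 3, 0, 1, 3, 3]
print(final_mask)  # [1, 0, 0, 1, 0, 0, 1, 1]

# Scores as a learned predictor might output them for four parameters.
print(prune_at_init([0.9, 0.1, 0.5, 0.7], sparsity=0.5))  # [1, 0, 0, 1]
```

In this sketch the pairs of initial weights and survival counts play the role of the training dataset from step 1, and `prune_at_init` corresponds to step 3. Because scoring is separated from pruning, the same pruning routine works whether the scores come from a hand-designed criterion or from a learned model.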