Neural Network Pruning by Gradient Descent

Zhang Zhang, Ruyi Tao, Jiang Zhang
2023-11-22
Abstract: The rapid increase in the parameters of deep learning models has led to significant costs, challenging computational efficiency and model interpretability. In this paper, we introduce a novel and straightforward neural network pruning framework that incorporates the Gumbel-Softmax technique. This framework enables the simultaneous optimization of a network's weights and topology in an end-to-end process using stochastic gradient descent. Empirical results demonstrate its exceptional compression capability, maintaining high accuracy on the MNIST dataset with only 0.15% of the original network parameters. Moreover, our framework enhances neural network interpretability, not only by allowing easy extraction of feature importance directly from the pruned network but also by enabling visualization of feature symmetry and the pathways of information propagation from features to outcomes. Although the pruning strategy is learned through deep learning, it is surprisingly intuitive and understandable, focusing on selecting key representative features and exploiting data patterns to achieve extreme sparse pruning. We believe our method opens a promising new avenue for deep learning pruning and the creation of interpretable machine learning systems.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the computational cost and loss of interpretability caused by the rapid growth in the number of parameters in deep learning models. Specifically:

1. **Over-parameterization**: As deep learning models grow more complex, their parameter counts rise dramatically, which raises training and inference costs and makes the models harder to interpret.
2. **Neural network pruning challenges**: Removing as many parameters as possible while preserving model performance is an NP-hard combinatorial optimization problem. Traditional methods such as heuristic search and iterative magnitude pruning (IMP) have limitations and struggle to achieve both strong compression and high accuracy.

To address these problems, the authors propose a neural network pruning framework based on the Gumbel-Softmax technique. The framework optimizes network weights and topology simultaneously through stochastic gradient descent (SGD), with the following goals:

- **Efficient compression**: On the MNIST dataset, the pruned network retains only 0.15% of the original parameters (404 weights) while maintaining high classification accuracy.
- **Enhanced model interpretability**: Feature importance can be extracted directly from the pruned network, and feature symmetry and information propagation paths can be visualized, which helps explain the relationship between input and output variables.

In short, the paper develops a neural network pruning method that greatly reduces model size while improving interpretability.
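
The role of Gumbel-Softmax in pruning can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each weight carries a learnable keep/drop score (the name `logits` is hypothetical), that a relaxed mask is sampled from a two-category Gumbel-Softmax during training, and that a hard mask is read off from the sign of the score afterwards. Only the forward pass is shown; in a real framework the same expression would be differentiated so that weights and scores are updated together by SGD.

```python
import math
import random


def gumbel_noise(rng):
    # Sample Gumbel(0, 1) noise by inverse transform; clamp u away from 0 and 1.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def relaxed_keep_mask(logit, temperature, rng):
    """Two-category (keep, drop) Gumbel-Softmax sample for one weight.

    `logit` is an assumed learnable score for keeping the weight; the drop
    category has a fixed score of 0. Returns the soft "keep" value in [0, 1].
    """
    keep = (logit + gumbel_noise(rng)) / temperature
    drop = (0.0 + gumbel_noise(rng)) / temperature
    m = max(keep, drop)  # subtract the max for numerical stability
    e_keep = math.exp(keep - m)
    e_drop = math.exp(drop - m)
    return e_keep / (e_keep + e_drop)


def masked_linear(x, weights, logits, temperature, rng):
    # Forward pass of a linear layer whose weights are gated by sampled masks.
    out = []
    for w_row, l_row in zip(weights, logits):
        total = 0.0
        for xj, w, l in zip(x, w_row, l_row):
            total += w * relaxed_keep_mask(l, temperature, rng) * xj
        out.append(total)
    return out


def hard_mask(logits):
    # Assumed pruning rule after training: keep a weight when its score is positive.
    return [[1 if l > 0 else 0 for l in row] for row in logits]


rng = random.Random(0)
weights = [[0.5, -1.2, 0.3], [0.8, 0.1, -0.7]]   # illustrative values
logits = [[4.0, -4.0, 0.0], [-4.0, 4.0, -4.0]]   # illustrative keep scores
x = [1.0, 2.0, 3.0]

y_soft = masked_linear(x, weights, logits, temperature=0.5, rng=rng)
mask = hard_mask(logits)
kept = sum(sum(row) for row in mask)
total = sum(len(row) for row in mask)
print(f"soft output: {y_soft}")
print(f"kept {kept} of {total} weights")
```

Lowering `temperature` pushes the sampled masks toward 0 or 1, so the relaxed network behaves increasingly like the pruned one while remaining differentiable with respect to the scores.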