Efficient Model Compression for Bayesian Neural Networks

Diptarka Saha, Zihe Liu, Feng Liang
2024-11-01
Abstract: Model Compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages including lower computation cost, deployability to devices of limited storage and memories, and resistance to adversarial attacks. This may be achieved via weight pruning or fully discarding certain input features. Here we demonstrate a novel strategy to emulate principles of Bayesian model selection in a deep learning setup. Given a fully connected Bayesian neural network with spike-and-slab priors trained via a variational algorithm, we obtain the posterior inclusion probability for every node that typically gets lost. We employ these probabilities for pruning and feature selection on a host of simulated and real-world benchmark data and find evidence of better generalizability of the pruned model in all our experiments.
Machine Learning, Applications
What problem does this paper attempt to address?
The problem this paper attempts to solve is how to effectively compress Bayesian neural networks (BNNs) while maintaining the model's predictive ability, in order to reduce computational cost, improve the feasibility of deployment on devices with limited storage and memory, and enhance resistance to adversarial attacks. Specifically, the paper proposes a new strategy that achieves model compression by using spike-and-slab priors for parameter pruning. This method not only provides effective uncertainty estimation but also estimates regression coefficients and complexity parameters simultaneously, thereby achieving better model generalization performance.

### Core problems of the paper

1. **Model compression**: How to reduce the number of model parameters and the computational complexity without sacrificing the model's predictive performance.
2. **Application of Bayesian methods**: How to use spike-and-slab priors in the Bayesian framework to achieve parameter pruning and thus model compression.
3. **Quantification of uncertainty**: How to maintain or improve the model's uncertainty estimation ability during model compression, to enhance the model's interpretability and reliability.

### Solutions

The paper proposes a method based on variational inference (VI) that achieves parameter pruning by introducing spike-and-slab priors. The specific steps are as follows:

1. **Model setting**:
   - Model the weights \(W\) and binary latent variables \(Z\) using the spike-and-slab prior \(\pi(W, Z)\).
   - The form of the spike-and-slab prior is:
     \[ \pi(W, Z) = \prod_{i} \left[ \pi \cdot N(w_i; 0, \tau_1^2) \right]^{Z_i} \cdot \left[ (1 - \pi) \cdot N(w_i; 0, \tau_0^2) \right]^{1 - Z_i} \]
     where \(\tau_0^2 < \tau_1^2\). Inside the product, \(\pi\) denotes the scalar prior inclusion probability, not the joint prior density \(\pi(W, Z)\).
2. **Variational objective function**:
   - Assume that the variational distributions \(q_p(Z)\) and \(q_\theta(W)\) are Bernoulli and Gaussian distributions, respectively.
   - The variational objective function \(J(\theta, p)\) is:
     \[ J(\theta, p) = - \mathbb{E}_{q_\theta(W)} \mathbb{E}_{q_p(Z)} \left[ \log p(D|W) + \log \frac{\pi(W, Z)}{q_\theta(W) q_p(Z)} \right] \]
3. **Optimization algorithm**:
   - Use coordinate descent to optimize the objective function \(J(\theta, p)\).
   - First, update the weight parameter \(\theta\) by gradient descent.
   - Then update the sparsity parameter \(p\) by a closed-form solution:
     \[ p_i^* = \frac{1}{1 + \exp \{A_i - B_i\}} \]
     where:
     \[ A_i = \frac{m_i^2 + \sigma_i^2}{2\tau_1^2} + \log \frac{\tau_1}{\pi}, \quad B_i = \frac{m_i^2 + \sigma_i^2}{2\tau_0^2} + \log \frac{\tau_0}{1 - \pi} \]
     Here \(m_i\) and \(\sigma_i^2\) are taken to be the mean and variance of the Gaussian variational distribution of \(w_i\), so that \(m_i^2 + \sigma_i^2 = \mathbb{E}_{q_\theta}[w_i^2]\).
4. **Experimental verification**:
   - Experiments on simulated data and real-world benchmark data verify that the pruned model has better generalization performance.

### Summary

This paper successfully achieves effective compression of Bayesian neural networks by introducing spike-and-slab priors and variational inference methods. This method can not only reduce the model's
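
The closed-form update for \(p_i^*\) in step 3 of the solution can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `update_inclusion_probs` and its argument names are hypothetical, and \(m_i\) and \(\sigma_i^2\) are assumed to be the mean and variance of the Gaussian variational factor for weight \(w_i\). The sigmoid is written through `tanh` only to avoid overflow when \(|A_i - B_i|\) is large.

```python
import numpy as np


def update_inclusion_probs(m, sigma2, tau0, tau1, pi):
    """Closed-form update p_i* = 1 / (1 + exp(A_i - B_i)).

    m, sigma2 : variational means and variances of the weights (assumed meaning)
    tau0, tau1: spike and slab standard deviations, with tau0 < tau1
    pi        : prior inclusion probability, strictly between 0 and 1
    """
    m = np.asarray(m, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)

    # E_q[w_i^2] under a Gaussian factor with mean m_i and variance sigma_i^2
    second_moment = m**2 + sigma2

    A = second_moment / (2.0 * tau1**2) + np.log(tau1 / pi)
    B = second_moment / (2.0 * tau0**2) + np.log(tau0 / (1.0 - pi))

    # 1 / (1 + exp(A - B)) equals sigmoid(B - A); the tanh form cannot overflow
    return 0.5 * (1.0 + np.tanh(0.5 * (B - A)))


# Illustrative values only: one weight at zero, one clearly non-zero
p = update_inclusion_probs(m=[0.0, 2.0], sigma2=[0.0, 0.0],
                           tau0=0.1, tau1=1.0, pi=0.5)
print(p)
```

With these illustrative values, the weight at zero gets \(p^* = 1/11\), because \(A - B = \log 10\), while the weight with \(m = 2\) gets an inclusion probability numerically indistinguishable from 1. In a coordinate-descent loop, this update would alternate with gradient steps on \(\theta\).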