Abstract: Model Compression has drawn much attention within the deep learning community recently. Compressing a dense neural network offers many advantages including lower computation cost, deployability to devices of limited storage and memories, and resistance to adversarial attacks. This may be achieved via weight pruning or fully discarding certain input features. Here we demonstrate a novel strategy to emulate principles of Bayesian model selection in a deep learning setup. Given a fully connected Bayesian neural network with spike-and-slab priors trained via a variational algorithm, we obtain the posterior inclusion probability for every node that typically gets lost. We employ these probabilities for pruning and feature selection on a host of simulated and real-world benchmark data and find evidence of better generalizability of the pruned model in all our experiments.

What problem does this paper attempt to address?

The problem that this paper attempts to solve is how to effectively compress Bayesian neural networks (BNNs) while maintaining the model's predictive ability, in order to reduce computational cost, improve the feasibility of deployment on devices with limited storage and memory, and enhance resistance to adversarial attacks. Specifically, the paper proposes a new strategy that achieves model compression by using spike-and-slab priors for parameter pruning. This method can not only provide effective uncertainty estimation, but also estimate regression coefficients and complexity parameters simultaneously, thereby achieving better model generalization performance.

### Core problems of the paper

1. **Model compression**: How to reduce the number of model parameters and the computational complexity without sacrificing the model's predictive performance.
2. **Application of Bayesian methods**: How to use spike-and-slab priors in the Bayesian framework to achieve parameter pruning and thus model compression.
3. **Quantification of uncertainty**: How to maintain or improve the model's uncertainty estimation ability during compression, to enhance the model's interpretability and reliability.

### Solutions

The paper proposes a method based on variational inference (VI) that achieves parameter pruning by introducing spike-and-slab priors. The specific steps are as follows:

1. **Model setting**:
   - Model the weights \(W\) and binary latent variables \(Z\) using the spike-and-slab prior \(\pi(W, Z)\).
   - The form of the spike-and-slab prior is:
     \[ \pi(W, Z) = \prod_{i} \left[ \pi \cdot N(w_i; 0, \tau_1^2) \right]^{Z_i} \cdot \left[ (1 - \pi) \cdot N(w_i; 0, \tau_0^2) \right]^{1 - Z_i} \]
     where \(\tau_0^2 < \tau_1^2\).
2. **Variational objective function**:
   - Assume that the variational distributions \(q_p(Z)\) and \(q_\theta(W)\) are a Bernoulli distribution and a Gaussian distribution, respectively.
   - The variational objective function \(J(\theta, p)\) is:
     \[ J(\theta, p) = - \mathbb{E}_{q_\theta(W)} \mathbb{E}_{q_p(Z)} \left[ \log p(D|W) + \log \frac{\pi(W, Z)}{q_\theta(W) q_p(Z)} \right] \]
3. **Optimization algorithm**:
   - Use the coordinate descent method to optimize the objective function \(J(\theta, p)\).
   - First, update the weight parameter \(\theta\) by gradient descent.
   - Then update the sparsity parameter \(p\) by a closed-form solution:
     \[ p_i^* = \frac{1}{1 + \exp \{A_i - B_i\}} \]
     where:
     \[ A_i = \frac{m_i^2 + \sigma_i^2}{2\tau_1^2} + \log \frac{\tau_1}{\pi}, \quad B_i = \frac{m_i^2 + \sigma_i^2}{2\tau_0^2} + \log \frac{\tau_0}{1 - \pi} \]
4. **Experimental verification**:
   - Through experiments on simulated data and real-world benchmark data, it is verified that the pruned model has better generalization performance.

### Summary

This paper successfully achieves effective compression of Bayesian neural networks by introducing spike-and-slab priors and variational inference methods. This method can not only reduce the model's
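
The closed-form update for \(p\) in step 3 of the Solutions can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation. It assumes `m` and `sigma` hold the mean \(m_i\) and standard deviation \(\sigma_i\) of the Gaussian \(q_\theta(W)\), and that `prior_pi` is the scalar \(\pi\) in the prior. The function names, the 0.5 pruning threshold, and the example values are all hypothetical and do not come from the paper.

```python
import numpy as np


def update_inclusion_probs(m, sigma, tau0, tau1, prior_pi):
    """Closed-form update p_i* = 1 / (1 + exp(A_i - B_i)).

    m, sigma : arrays with the variational mean and std of each weight.
    tau0, tau1 : spike and slab standard deviations, with tau0 < tau1.
    prior_pi : prior inclusion probability (the scalar pi in the prior).
    """
    m = np.asarray(m, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # E_q[w_i^2] for a Gaussian with mean m_i and std sigma_i.
    second_moment = m**2 + sigma**2

    a = second_moment / (2.0 * tau1**2) + np.log(tau1 / prior_pi)
    b = second_moment / (2.0 * tau0**2) + np.log(tau0 / (1.0 - prior_pi))

    # 1 / (1 + exp(a - b)), computed via logaddexp to avoid overflow
    # when a - b is large.
    return np.exp(-np.logaddexp(0.0, a - b))


def prune_mask(p, threshold=0.5):
    """Keep entries whose inclusion probability exceeds the threshold.

    The 0.5 default is an assumption for illustration only.
    """
    return np.asarray(p) > threshold


# Hypothetical values: one weight near zero, one clearly non-zero.
example_m = np.array([0.0, 2.0])
example_sigma = np.array([0.1, 0.1])
example_p = update_inclusion_probs(
    example_m, example_sigma, tau0=0.1, tau1=1.0, prior_pi=0.5
)
example_keep = prune_mask(example_p)
```

With these example values, the weight whose variational mean is near zero gets a low inclusion probability and is masked out, while the weight with a large mean is kept. In a full coordinate-descent loop, this update would alternate with gradient steps on \(\theta\).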

Efficient Model Compression for Bayesian Neural Networks

Improved Model Compression Method Based on Information Entropy

Structured Bayesian Compression for Deep Neural Networks Based on The Turbo-VBI Approach

Efficient Bayesian CNN Model Compression using Bayes by Backprop and L1-Norm Regularization

On Compression Principle and Bayesian Optimization for Neural Networks

Dirichlet Pruning for Neural Network Compression

Network Compression Via Recursive Bayesian Pruning.

On Model Compression for Neural Networks: Framework, Algorithm, and Convergence Guarantee

A Survey of Model Compression for Deep Neural Networks

Multi-Resolution Model Compression for Deep Neural Networks: A Variational Bayesian Approach

A Model Compression Method Using Significant Data and Knowledge Distillation

Compressing Neural Networks Using the Variational Information Bottleneck.

Compression with Bayesian Implicit Neural Representations

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

Efficient Network Compression Through Smooth-Lasso Constraint

Spike-and-slab shrinkage priors for structurally sparse Bayesian neural networks

Model Compression for Deep Neural Networks: A Survey

Model compression as constrained optimization, with application to neural nets. Part V: combining compressions

Neural Network Compression using Binarization and Few Full-Precision Weights

Neural Network Compression Via Low Frequency Preference

Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction