Magnificent Minified Models

Rich Harang, Hillary Sanders

2023-06-17

Abstract: This paper concerns the task of taking a large trained neural network and 'compressing' it by deleting parameters or entire neurons, with minimal decrease in the resulting model's accuracy. We compare several methods of parameter and neuron selection: dropout-based neuron damage estimation, neuron merging, absolute-value-based selection, random selection, and OBD (Optimal Brain Damage). We also evaluate a variation on the classic OBD method, which we call OBD-SD, that slightly outperformed all other parameter and neuron selection methods in our tests with substantial pruning. We compare these methods against quantization of parameters. We also compare these techniques (all applied to a trained neural network) with neural networks trained from scratch (random weight initialization) on various pruned architectures. Our results are only barely consistent with the Lottery Ticket Hypothesis, in that fine-tuning a parameter-pruned model does slightly better than retraining a similarly pruned model from scratch with randomly initialized weights. For neuron-level pruning, retraining from scratch did much better in our experiments.

Machine Learning

What problem does this paper attempt to address?

The paper explores how to compress large trained neural networks by reducing their parameter count while preserving as much accuracy as possible. It compares several methods for selecting which parameters or neurons to remove: dropout-based neuron damage estimation, neuron merging, absolute-value-based selection, random selection, and Optimal Brain Damage (OBD). It also proposes a variation on OBD, called OBD-SD, and compares all of these pruning methods against parameter quantization.

The paper reports that under substantial pruning, OBD-SD slightly outperforms the other parameter and neuron selection methods. For parameter-level pruning, fine-tuning the pruned model does slightly better than retraining a similarly pruned model from scratch with randomly initialized weights, which is only barely consistent with the Lottery Ticket Hypothesis. For neuron-level pruning, retraining from scratch does much better.

In summary, the paper addresses the following questions:

1. **How to effectively compress neural networks**: removing parameters or entire neurons to reduce model size while minimizing the loss in accuracy.
2. **Comparison of different pruning methods**: evaluating how well the various selection methods work and which performs best.
3. **Fine-tuning versus retraining from scratch**: checking whether fine-tuning a pruned model beats training the same pruned architecture from randomly initialized weights.
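As an illustration of two of the simpler selection criteria named above, the following is a minimal sketch of absolute-value (magnitude) selection and random selection applied to a single weight matrix. It is not the authors' implementation: the function names `magnitude_mask` and `random_mask`, the use of NumPy, and the 50% sparsity level are all assumptions made for this example.

```python
import numpy as np


def magnitude_mask(weights, sparsity):
    """Boolean mask that drops the `sparsity` fraction of smallest-|w| entries.

    Hypothetical helper for illustration; True means "keep this weight".
    """
    n_prune = int(round(sparsity * weights.size))
    mask = np.ones(weights.size, dtype=bool)
    # Flattened indices sorted from smallest to largest absolute value.
    order = np.argsort(np.abs(weights), axis=None)
    mask[order[:n_prune]] = False
    return mask.reshape(weights.shape)


def random_mask(weights, sparsity, rng):
    """Boolean mask that drops a uniformly random `sparsity` fraction of entries."""
    n_prune = int(round(sparsity * weights.size))
    mask = np.ones(weights.size, dtype=bool)
    dropped = rng.choice(weights.size, size=n_prune, replace=False)
    mask[dropped] = False
    return mask.reshape(weights.shape)


rng = np.random.default_rng(0)
W = rng.normal(size=(4, 5))          # stand-in for one trained layer's weights

m_mag = magnitude_mask(W, 0.5)       # keep the 10 largest-magnitude weights
m_rnd = random_mask(W, 0.5, rng)     # keep 10 weights chosen at random

W_pruned = W * m_mag                 # pruned weights are set to exactly zero
```

In a comparison like the one the paper describes, each mask would be applied to a trained network and the pruned model would then be either fine-tuned or retrained from scratch. That training step is omitted here.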

