Pruning Deep Neural Networks by Optimal Brain Damage

Chao Liu, Zhiyong Zhang, Dong Wang
DOI: https://doi.org/10.21437/interspeech.2014-281
2014-01-01
Abstract: A main advantage of the deep neural network (DNN) model lies in the fact that no artificial assumptions are placed on the data distribution or the model structure, which makes it possible to learn very flexible models. This flexibility, however, may lead to highly redundant parameters, and hence to heavy computation and a risk of over-fitting. Network pruning cuts off unimportant connections, and can therefore be used to produce parsimonious models that generalize well. This paper proposes to use optimal brain damage (OBD) for DNN pruning. OBD computes connection salience based on Hessians, and is thus sound in theory and reliable in practice. We present our implementation of OBD for DNNs, and demonstrate that OBD pruning can produce very sparse DNNs while retaining the discriminative power of the original network to a large extent. By comparing with simple magnitude-based pruning, we find that for lightly pruned networks the choice of pruning method matters little, since retraining can largely recover the loss caused by pruning, while for highly pruned networks sophisticated pruning methods (such as OBD) are clearly superior.
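
The contrast between OBD and magnitude-based pruning can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the diagonal Hessian entries are already available (OBD normally obtains them with a backpropagation-style pass, which is not shown here), and the function names `obd_saliency` and `prune_mask` as well as the toy values are hypothetical. The salience formula s_i = 0.5 * h_ii * w_i^2 is the standard OBD criterion under the diagonal-Hessian approximation.

```python
import numpy as np

def obd_saliency(weights, hessian_diag):
    """Standard OBD salience per weight: s_i = 0.5 * h_ii * w_i**2."""
    return 0.5 * hessian_diag * weights ** 2

def prune_mask(scores, sparsity):
    """Boolean mask that drops the lowest-scoring `sparsity` fraction of weights."""
    n_prune = int(round(sparsity * scores.size))
    mask = np.ones(scores.size, dtype=bool)
    if n_prune > 0:
        # Flat indices of the n_prune smallest scores.
        drop = np.argsort(scores, axis=None, kind="stable")[:n_prune]
        mask[drop] = False
    return mask.reshape(scores.shape)

# Toy values (assumed for illustration): one large weight sits in a flat
# direction of the loss (tiny h_ii), one small weight in a sharply curved one.
w = np.array([0.9, -0.1, 0.5, -0.05])
h = np.array([0.01, 4.0, 1.0, 2.0])

obd_mask = prune_mask(obd_saliency(w, h), sparsity=0.5)
mag_mask = prune_mask(np.abs(w), sparsity=0.5)

print("kept by OBD:      ", obd_mask)
print("kept by magnitude:", mag_mask)
```

In this toy case the two criteria disagree: magnitude pruning keeps the large weight 0.9, while OBD removes it because the loss is nearly flat along that weight, and keeps the small weight -0.1 whose curvature is high. In either scheme the surviving weights would then be retrained with the mask held fixed.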