Learning Sparse Structured Ensembles with Stochastic Gradient MCMC Sampling and Network Pruning.

Yichi Zhang, Zhijian Ou
DOI: https://doi.org/10.1109/mlsp.2018.8516928
2018-01-01
Abstract: An ensemble of neural networks is known to be more robust and accurate than an individual network, but usually at a linearly increasing cost in both training and testing. In this work, we propose a two-stage method to learn Sparse Structured Ensembles (SSEs) for neural networks. In the first stage, we run stochastic gradient MCMC (SG-MCMC) with group sparse priors to draw an ensemble of samples from the posterior distribution of network parameters. In the second stage, we apply weight pruning to each sampled network and then retrain over the remaining connections. Learning SSEs in this way not only achieves high prediction accuracy but also reduces memory and computation cost in both training and testing. We conduct a series of evaluation experiments by learning SSEs with both feedforward neural networks (FNNs) and LSTMs. To the best of our knowledge, this work is the first methodology and empirical study to integrate SG-MCMC, group sparse priors, and network pruning for learning neural network ensembles.
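The two-stage procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear model in place of an FNN or LSTM, uses stochastic gradient Langevin dynamics (SGLD) as one simple instance of SG-MCMC, takes each row of the weight matrix as a group for a group-lasso-style prior, and uses plain magnitude pruning. All function names and hyperparameters (`sgld_samples`, `prune`, `retrain`, `ensemble_predict`, `lam`, `sparsity`) are hypothetical.

```python
import numpy as np

def sgld_samples(X, y, n_samples=5, n_steps=2000, step=5e-4, lam=1.0,
                 batch=32, seed=0):
    """Stage 1 (sketch): draw weight samples with SGLD under a
    group-lasso-style prior. Each row of W is treated as one group."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.1 * rng.standard_normal((d, y.shape[1]))
    thin = n_steps // n_samples          # keep one sample every `thin` steps
    samples = []
    for t in range(1, n_steps + 1):
        idx = rng.choice(n, size=batch, replace=False)
        resid = X[idx] @ W - y[idx]
        # Minibatch estimate of the negative log-likelihood gradient
        # (unit-variance Gaussian likelihood, an assumption of this sketch).
        grad_lik = (n / batch) * X[idx].T @ resid
        # Gradient of lam * sum_g ||W_g||_2 (small constant avoids 0/0).
        norms = np.linalg.norm(W, axis=1, keepdims=True)
        grad_prior = lam * W / (norms + 1e-8)
        # Langevin update: gradient step plus Gaussian noise of variance 2*step.
        noise = np.sqrt(2.0 * step) * rng.standard_normal(W.shape)
        W = W - step * (grad_lik + grad_prior) + noise
        if t % thin == 0:
            samples.append(W.copy())
    return samples

def prune(W, sparsity=0.5):
    """Stage 2a (sketch): zero out the smallest-magnitude weights."""
    k = int(sparsity * W.size)
    if k == 0:
        return W.copy(), np.ones(W.shape, dtype=bool)
    thresh = np.sort(np.abs(W), axis=None)[k - 1]
    mask = np.abs(W) > thresh
    return W * mask, mask

def retrain(W, mask, X, y, lr=0.1, n_steps=200):
    """Stage 2b (sketch): gradient descent on the surviving weights only."""
    W = W.copy()
    n = X.shape[0]
    for _ in range(n_steps):
        grad = X.T @ (X @ W - y) / n
        W -= lr * grad * mask            # pruned entries receive no update
    return W

def ensemble_predict(weights, X):
    """Average the predictions of all pruned, retrained networks."""
    return np.mean([X @ W for W in weights], axis=0)
```

To use the sketch, one would call `sgld_samples` on the training data, pass each returned sample through `prune` and then `retrain`, and hand the resulting list to `ensemble_predict`. Because every member is sparse, the ensemble stores and multiplies fewer weights than the same number of dense networks would.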