FreezeOut: Accelerate Training by Progressively Freezing Layers

Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston
DOI: https://doi.org/10.48550/arXiv.1706.04983
2017-06-19
Abstract: The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time during training with 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut
Machine Learning
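The freezing idea described in the abstract can be illustrated with a small scheduling sketch. This is not the authors' implementation: it assumes a linear schedule in which the first layer freezes at a fraction `t0` of the run and the last layer trains until the end, and the names `freeze_points`, `first_trainable_layer`, and `t0` are hypothetical, chosen only for this example.

```python
def freeze_points(num_layers, t0=0.5):
    """Fraction of training at which each layer stops being updated.

    Assumption: a linear schedule. The first layer freezes at t0 and the
    last layer trains for the whole run (fraction 1.0).
    """
    if num_layers == 1:
        return [1.0]
    step = (1.0 - t0) / (num_layers - 1)
    return [t0 + i * step for i in range(num_layers)]


def first_trainable_layer(progress, points):
    """Index of the earliest layer still being trained at `progress` in [0, 1].

    Layers are frozen front to back, so every layer before this index can be
    skipped in the backward pass.
    """
    for i, t in enumerate(points):
        if progress < t:
            return i
    return len(points) - 1  # keep the last layer trainable until the end


points = freeze_points(num_layers=5, t0=0.5)
for progress in (0.0, 0.5, 0.8, 1.0):
    k = first_trainable_layer(progress, points)
    print(f"progress={progress:.1f}: backward pass covers layers {k}..{len(points) - 1}")
```

In a real training loop, the returned index would mark where backpropagation can stop, which is where any wall-clock saving would come from, since gradients for the frozen early layers are no longer computed.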