Pruning Early Exit Networks

Alperen Görmez, Erdem Koyuncu
DOI: https://doi.org/10.48550/arXiv.2207.03644
2022-07-08
Abstract: Deep learning models that perform well often have high computational costs. In this paper, we combine two approaches that try to reduce the computational cost while keeping the model performance high: pruning and early exit networks. We evaluate two approaches of pruning early exit networks: (1) pruning the entire network at once, (2) pruning the base network and additional linear classifiers in an ordered fashion. Experimental results show that pruning the entire network at once is a better strategy in general. However, at high accuracy rates, the two approaches have a similar performance, which implies that the processes of pruning and early exit can be separated without loss of optimality.
Machine Learning, Computer Vision and Pattern Recognition, Neural and Evolutionary Computing
What problem does this paper attempt to address?
This paper aims to reduce the computational cost of deep learning models while keeping their performance high. To do so, the authors combine two techniques: pruning and early exit networks. Pruning removes redundant weights from a neural network to reduce the amount of computation. Early exit networks attach additional exit points (linear classifiers) at intermediate layers, so that some samples can obtain a prediction without passing through the entire network, which saves computation.

The authors evaluate two ways of pruning an early exit network:

1. **Prune the entire network at once**: prune the base network and the attached linear classifiers simultaneously.
2. **Prune in sequence**: first prune the base network, then prune the attached linear classifiers.

The experiments show that pruning the entire network at once is generally the better strategy, but at high accuracy rates the two approaches perform similarly. This implies that pruning and early exit can be handled as separate steps without loss of optimality. The summary also notes that, with an appropriate pruning strategy, computational cost can be reduced significantly without sacrificing performance.
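
The difference between the two pruning strategies, and the way an early exit saves computation, can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. It assumes magnitude-based pruning (zeroing the smallest-magnitude weights) and a softmax-confidence threshold as the exit rule, neither of which is specified in the text above. It also flattens weights into simple lists and omits any retraining between pruning steps. All function names are hypothetical.

```python
import math

def magnitude_cutoff(weights, sparsity):
    """Magnitude at or below which roughly `sparsity` of the weights fall."""
    mags = sorted(abs(w) for w in weights)
    k = int(sparsity * len(mags))
    return mags[k - 1] if k > 0 else -1.0  # -1.0 prunes nothing

def prune(weights, cutoff):
    """Zero every weight whose magnitude is at or below the cutoff."""
    return [0.0 if abs(w) <= cutoff else w for w in weights]

def prune_at_once(base, exits, sparsity):
    """Strategy 1 (assumed form): one global ranking over all weights."""
    cutoff = magnitude_cutoff(base + exits, sparsity)
    return prune(base, cutoff), prune(exits, cutoff)

def prune_in_sequence(base, exits, sparsity):
    """Strategy 2 (assumed form): base network first, then exit classifiers."""
    pruned_base = prune(base, magnitude_cutoff(base, sparsity))
    pruned_exits = prune(exits, magnitude_cutoff(exits, sparsity))
    return pruned_base, pruned_exits

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_predict(exit_logits, threshold):
    """Return (predicted_class, exit_index) for the first confident exit.

    `exit_logits` holds one logit vector per exit, shallowest first; the
    last entry stands in for the final classifier, which always answers.
    In a real network the deeper logits would only be computed on demand.
    """
    last = len(exit_logits) - 1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold or i == last:
            return probs.index(confidence), i

# Toy weights: the base network has large weights, the exits small ones.
base = [0.9, -0.8, 0.7, 0.6]
exits = [0.1, -0.2, 0.3, 0.05]

at_once = prune_at_once(base, exits, 0.5)
in_sequence = prune_in_sequence(base, exits, 0.5)
```

With these toy values, the global ranking in `prune_at_once` leaves the base weights untouched and zeroes every exit weight, while `prune_in_sequence` removes half of each part. Under the assumptions of this sketch, the two strategies differ in how the sparsity budget is distributed between the base network and the exit classifiers. For the exit rule, `early_exit_predict([[2.0, 1.9], [5.0, 0.0]], 0.9)` skips the uncertain first exit and answers at the second one.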