Model Compression by Iterative Pruning with Knowledge Distillation and Its Application to Speech Enhancement

Zeyuan Wei,Hao Li,Xueliang Zhang
DOI: https://doi.org/10.21437/interspeech.2022-619
2022-01-01
Abstract:Over the past decade, deep learning has demonstrated its effectiveness and keeps setting new records in a wide variety of tasks. However, good model performance usually leads to a huge amount of parameters and extremely high computational complexity which greatly limit the use cases of deep learning models, particularly in embedded systems. Therefore, model compression is getting more and more attention. In this paper, we propose a compression strategy based on iterative pruning and knowledge distillation. Specifically, in each iteration, we first utilize a pruning criterion to drop the weights which have less impact on performance. Then, the model before pruning is used as a teacher to fine-tune the student which is the model after pruning. After several iterations, we get the final compressed model. The proposed method is verified on gated convolutional recurrent network (GCRN) and long short-term memory (LSTM) for single channel speech enhancement tasks. Experimental results show that the proposed compression strategy can dramatically reduce the model size by 40x without significant performance degradation for GCRN.
What problem does this paper attempt to address?