MCCP: Multi-Collaboration Channel Pruning for Model Compression

Yuanchao Yan, Bo Liu, Weiwei Lin, Yuping Chen, Keqin Li, Jiangtao Ou, Chengyuan Fan
DOI: https://doi.org/10.1007/s11063-022-10984-6
IF: 2.565
2023-01-01
Neural Processing Letters
Abstract: Large deep neural networks are difficult to deploy on resource-constrained devices. Channel pruning addresses this problem by compressing the model and reducing its resource demand. However, most channel pruning methods evaluate channels from a single perspective and sometimes remove important channels by mistake. We therefore propose a multi-collaboration channel pruning (MCCP) method based on an analysis of the input and output of the batch normalization (BN) layer. Channel importance is evaluated by combining the weights of the convolution layer with the two learnable parameters of the BN layer, which yields more reasonable pruning decisions. In addition, we impose polarization regularization on the scaling factors of neurons so that important and unimportant channels are easier to distinguish, which minimizes the performance loss of the model after pruning. We confirm the effectiveness of our method on YOLOv3: MCCP reduces the number of parameters by 95.9%, speeds up inference by a factor of 3.8, compresses the model size to 9.6 MB, and maintains comparable recognition accuracy.
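The two ideas in the abstract can be illustrated with a minimal sketch. It is not the scoring rule from the paper, which the abstract does not state. The sketch assumes that the two learnable BN parameters are the scale (gamma) and shift (beta), that the polarization penalty takes the form t * sum|gamma| - sum|gamma - mean(gamma)|, and that a channel score can be formed as |gamma| * mean|w| + max(beta, 0). The function names, the combining rule, and the value of t are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def polarization_penalty(gammas: Sequence[float], t: float = 1.2) -> float:
    """Assumed polarization form: t * sum|g| - sum|g - mean(g)|.

    The penalty is lower when scaling factors split into a group near
    zero and a group far from zero than when they all sit at one value.
    """
    mean = sum(gammas) / len(gammas)
    return sum(t * abs(g) - abs(g - mean) for g in gammas)


def channel_scores(
    filters: Sequence[Sequence[float]],
    gammas: Sequence[float],
    betas: Sequence[float],
) -> List[float]:
    """Hypothetical importance score per output channel.

    filters[c] holds the flattened convolution weights of channel c.
    The rule |gamma| * mean|w| + max(beta, 0) is a placeholder that
    merely combines the three quantities named in the abstract.
    """
    scores = []
    for w, g, b in zip(filters, gammas, betas):
        mean_abs_w = sum(abs(x) for x in w) / len(w)
        scores.append(abs(g) * mean_abs_w + max(b, 0.0))
    return scores


def select_channels(scores: Sequence[float], keep_ratio: float) -> List[int]:
    """Return the indices of the highest-scoring channels, in ascending order."""
    k = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    filters = [[0.5, -0.5], [0.01, 0.02], [1.0, 1.0]]
    gammas = [1.0, 0.05, 0.8]
    betas = [0.1, -0.2, 0.0]
    scores = channel_scores(filters, gammas, betas)
    print(select_channels(scores, keep_ratio=0.67))
```

In this toy input the second channel has a small scaling factor, small weights, and a negative shift, so it receives the lowest score and is the one dropped. Training with the penalty added to the task loss is what would push scaling factors apart in practice; that step is omitted here.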