Discrete Cosine Transform for Filter Pruning

Chen Yaosen, Zhou Renshuang, Guo Bing, Shen Yan, Wang Wei, Wen Xuming, Suo Xinhua
DOI: https://doi.org/10.1007/s10489-022-03604-2
IF: 5.3
2023-01-01
Applied Intelligence
Abstract: Neural network filter pruning has proven effective for deploying models with fewer resources and faster inference. However, the pruning process in existing methods is complex and inefficient. This paper uses the Discrete Cosine Transform (DCT) to transform feature maps to the frequency domain and proposes a simple and effective filter importance calculation method for filter pruning, called DCTPruning. The important information in an image is usually contained in its low-frequency part. Because the DCT transforms an image from the spatial domain to the frequency domain, the high-frequency part can be removed through lossy compression without losing the important information. This study finds that the same phenomenon applies to the feature maps of neural networks. Accordingly, the DCT is used to calculate the importance of each filter from its feature map, and filters are pruned according to that importance. The proposed method achieves good results not only on classification but also on salient object detection. For classification, compared with existing state-of-the-art methods, it significantly improves FLOPs and parameter reduction while maintaining accuracy. For example, for VGG-16 on CIFAR-10, parameters are reduced by 94.2% and FLOPs by 84.1%, while accuracy drops by only 1.43% (92.53% vs 93.96%); for ResNet-50 on ImageNet, DCTPruning reduces FLOPs by 74.1% and parameters by 70.8%, while Top-1 accuracy drops by only 3.84% (72.31% vs 76.15%) and Top-5 accuracy by only 2.10% (90.77% vs 92.87%). For salient object detection, the method also prunes the network effectively and achieves a large reduction in model size while maintaining similar performance.
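The idea of scoring filters in the frequency domain can be illustrated with a minimal sketch. The abstract does not state the exact importance formula, so the score used here (energy in the low-frequency corner of each feature map's DCT) is an assumption made for illustration, not the paper's definition. The function names `dct_basis`, `dct2`, `filter_importance`, and `select_filters`, and the `keep` and `prune_ratio` parameters, are likewise hypothetical. Plain Python is used to keep the sketch dependency-free; in practice a library routine such as `scipy.fft.dctn` would be the more usual choice.

```python
import math


def dct_basis(size):
    """Orthonormal DCT-II basis: row k is the k-th cosine sampled at `size` points."""
    rows = []
    for k in range(size):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / size)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * size))
                     for i in range(size)])
    return rows


def dct2(fmap):
    """2-D DCT-II of one feature map given as a list of equal-length rows."""
    h, w = len(fmap), len(fmap[0])
    bh, bw = dct_basis(h), dct_basis(w)
    # Transform along the width of every row, then along the height.
    rows_t = [[sum(bw[v][j] * row[j] for j in range(w)) for v in range(w)]
              for row in fmap]
    return [[sum(bh[u][i] * rows_t[i][v] for i in range(h)) for v in range(w)]
            for u in range(h)]


def filter_importance(maps, keep=2):
    """ASSUMED score: mean energy in the keep x keep low-frequency DCT corner.

    `maps` holds the feature maps one filter produced for several inputs.
    """
    total = 0.0
    for fmap in maps:
        coeffs = dct2(fmap)
        kh, kw = min(keep, len(coeffs)), min(keep, len(coeffs[0]))
        total += sum(coeffs[u][v] ** 2 for u in range(kh) for v in range(kw))
    return total / len(maps)


def select_filters(scores, prune_ratio):
    """Return the sorted indices of the filters kept after pruning the lowest scores."""
    n_keep = len(scores) - int(len(scores) * prune_ratio)
    ranked = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    return sorted(ranked[:n_keep])


# Toy example: a smooth (constant) map versus a high-frequency checkerboard.
smooth = [[1.0] * 4 for _ in range(4)]
checker = [[(-1.0) ** (i + j) for j in range(4)] for i in range(4)]
scores = [filter_importance([smooth]), filter_importance([checker])]
kept = select_filters(scores, prune_ratio=0.5)
```

Under this assumed score, the filter whose response is concentrated at low frequencies ranks above the one producing a checkerboard pattern, so the latter is pruned first. Other scoring rules built on the same DCT coefficients would slot into `filter_importance` without changing the selection step.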