Multi-Dimensional Dynamic Pruning: Exploring Spatial and Channel Fuzzy Sparsity

Mingwen Shao, Jiandong Kuang, Chao Wang, Wangmeng Zuo, Guoyin Wang
DOI: https://doi.org/10.1109/tfuzz.2024.3363220
IF: 12.253
2024-01-01
IEEE Transactions on Fuzzy Systems
Abstract: Dynamic pruning is an effective model compression method for reducing the computational cost of networks. However, existing dynamic pruning methods prune along only a single dimension (channel, spatial, or depth) and therefore cannot fully exploit the redundancy of the network. Moreover, most current state-of-the-art methods implement dynamic pruning by masking out a subset of channels and pixels during training, which fails to accelerate inference. To tackle these limitations, we propose a novel fuzzy-based Multi-Dimensional Dynamic Pruning (MDDP) paradigm that dynamically compresses neural networks along both the channel and spatial dimensions. Specifically, we design a multi-dimensional fuzzy-mask block that simultaneously learns which spatial positions or channels are redundant and should be pruned. The Gumbel-Softmax trick, combined with a sparsity loss, is then used to train these mask modules end to end. At test time, we convert the features and the convolution kernels into two matrices and implement sparse convolution through matrix multiplication to accelerate network inference. Extensive experiments demonstrate that our method outperforms existing methods in terms of accuracy and computational cost. For instance, on the CIFAR-10 dataset, our method prunes 68% of the FLOPs of ResNet-56 with only a 0.07% drop in Top-1 accuracy.
computer science, artificial intelligence; engineering, electrical & electronic
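
The test-time step described in the abstract (unfolding the features and the kernels into two matrices, then multiplying only the parts that survive pruning) can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes stride 1, no padding, and binary masks that are already given; in the paper the masks are predicted by the fuzzy-mask block, and applying the channel mask to output channels is an assumption here. The names `im2col`, `sparse_conv`, `channel_mask`, and `spatial_mask` are hypothetical.

```python
import numpy as np


def im2col(x, k):
    """Unfold a (C, H, W) feature map into a (C*k*k, H_out*W_out) matrix.

    Assumes stride 1 and no padding. Rows are ordered (channel, dy, dx),
    which matches weight.reshape(C_out, -1) for a (C_out, C, k, k) kernel.
    """
    c, h, w = x.shape
    h_out, w_out = h - k + 1, w - k + 1
    cols = np.empty((c * k * k, h_out * w_out), dtype=x.dtype)
    row = 0
    for ci in range(c):
        for dy in range(k):
            for dx in range(k):
                cols[row] = x[ci, dy:dy + h_out, dx:dx + w_out].reshape(-1)
                row += 1
    return cols


def sparse_conv(x, weight, channel_mask, spatial_mask):
    """Compute a convolution only for kept output channels and positions.

    x:            (C_in, H, W) input features
    weight:       (C_out, C_in, k, k) convolution kernels
    channel_mask: (C_out,) bool, True for output channels to keep (assumed)
    spatial_mask: (H_out, W_out) bool, True for positions to compute (assumed)
    Pruned entries of the output are left at zero.
    """
    c_out, _, k, _ = weight.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1

    cols = im2col(x, k)                # feature matrix
    w_mat = weight.reshape(c_out, -1)  # kernel matrix

    keep_ch = np.flatnonzero(channel_mask)
    keep_pos = np.flatnonzero(spatial_mask.reshape(-1))

    # Multiply only the kept kernel rows with the kept feature columns.
    small = w_mat[keep_ch] @ cols[:, keep_pos]

    # Scatter the compact result back into a dense, zero-filled output.
    out = np.zeros((c_out, h_out * w_out), dtype=x.dtype)
    out[np.ix_(keep_ch, keep_pos)] = small
    return out.reshape(c_out, h_out, w_out)
```

In this sketch the savings come from the size of the matrix product: with a fraction p of channels and a fraction q of positions kept, the multiplication costs roughly p*q of the dense one. Whether that turns into wall-clock speedup depends on the gather and scatter overhead, which this example does not measure.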