Dynamic CNN Accelerator Supporting Efficient Filter Generator with Kernel Enhancement and Online Channel Pruning

Chen Tang, Wenyu Sun, Wenxun Wang, Yongpan Liu
DOI: https://doi.org/10.1109/asp-dac52403.2022.9712483
2022-01-01
Abstract: Deep neural networks achieve impressive performance on many tasks, but at heavy storage and computation cost. Previous works adopt pruning-based methods to slim deep networks. In traditional pruning, either the convolution kernels or the network inference is static, which prevents the model parameters from being fully compressed and limits performance. In this paper, we propose an online pruning algorithm that supports dynamic kernel generation and dynamic network inference at the same time. We propose two novel techniques: a filter generator and importance-level-based channel pruning. Moreover, we validate the proposed method with an implementation on an Ultra96-v2 FPGA. Compared with state-of-the-art static or dynamic pruning methods, our method reduces the top-5 accuracy drop by nearly 50% for a ResNet model on ImageNet at a similar compression level. It also achieves better accuracy while requiring up to 50% fewer weights to be stored on chip.
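To make the idea of input-dependent (online) channel pruning concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the abstract does not define the importance level, so the score used here (mean absolute activation per channel) is an assumption, and the names `channel_importance`, `prune_channels`, and `keep_ratio` are hypothetical.

```python
def channel_importance(feature_map):
    """Score each channel by its mean absolute activation.

    feature_map is a nested list indexed as [channel][row][col].
    The scoring rule is an assumption made for illustration only.
    """
    scores = []
    for channel in feature_map:
        values = [abs(v) for row in channel for v in row]
        scores.append(sum(values) / len(values))
    return scores


def prune_channels(feature_map, keep_ratio):
    """Zero out all but the highest-scoring channels for this one input."""
    scores = channel_importance(feature_map)
    keep = max(1, round(len(scores) * keep_ratio))
    # Indices of the `keep` most important channels for this input.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:keep])
    pruned = [
        channel if i in kept else [[0.0] * len(row) for row in channel]
        for i, channel in enumerate(feature_map)
    ]
    return pruned, sorted(kept)
```

Because the scores are computed from the activations of the current input, a different input can keep a different set of channels, which is what distinguishes this kind of pruning from a static mask fixed at training time.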