Single-shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration

Bofeng Jiang, Jun Chen, Yong Liu
DOI: https://doi.org/10.1016/j.engappai.2023.106816
IF: 8
2023-01-01
Engineering Applications of Artificial Intelligence
Abstract: Deploying CNNs on embedded systems is challenging because of model size limitations. Pruning and quantization can help, but applying them separately is time-consuming. Our Single-Shot Pruning and Quantization strategy addresses these issues by quantizing and pruning in a single process. We evaluated our method on the CIFAR-10 and CIFAR-100 datasets for image classification. Our model is 69.4% smaller with little accuracy loss and runs 6–8 times faster on NVIDIA Xavier NX hardware.
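The abstract does not spell out the algorithm, so the following is only a rough sketch of what pruning and quantizing a weight tensor in one pass can look like, not the authors' method. It assumes plain magnitude pruning and symmetric uniform quantization; the function name `prune_and_quantize` and the `sparsity` and `num_bits` parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def prune_and_quantize(weights, sparsity=0.5, num_bits=8):
    """Illustrative single-pass prune + quantize (assumed technique, not the paper's).

    Returns the integer weights, the dequantization scale, and the keep-mask.
    """
    assert 2 <= num_bits <= 8, "int8 storage below assumes at most 8 bits"
    w = np.asarray(weights, dtype=np.float32)

    # Pruning step: drop the `sparsity` fraction of weights with the smallest magnitude.
    k = int(sparsity * w.size)
    if k > 0:
        threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        mask = np.abs(w) > threshold
    else:
        mask = np.ones(w.shape, dtype=bool)
    pruned = w * mask

    # Quantization step: symmetric uniform quantization of the surviving weights.
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = float(np.max(np.abs(pruned)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(pruned / scale), -qmax, qmax).astype(np.int8)
    return q, scale, mask


# Example usage: half of the weights are pruned, the rest are stored as int8.
q, scale, mask = prune_and_quantize([0.1, -0.25, 0.02, 1.0], sparsity=0.5)
approx = q.astype(np.float32) * scale  # dequantized approximation of the kept weights
```

Both steps here run on the same tensor in one call, which is the sense in which the sketch mirrors the "single process" described in the abstract. How the paper chooses thresholds, bit widths, or any retraining schedule is not stated in this record.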