Regularization-Free Structural Pruning for GPU Inference Acceleration

Chuliang Guo, Yanbing Yang, Li Zhang, Shaodi Wang, He Li, Keyu Long, Xunzhao Yin, Cheng Zhuo
DOI: https://doi.org/10.1109/ISQED51717.2021.9424299
2021-01-01
Abstract: Pruning has recently become prevalent in deep neural network compression as a way to reduce memory footprint and accelerate network inference. Unstructured pruning, i.e., fine-grained pruning, helps preserve model accuracy, while structural pruning, i.e., coarse-grained pruning, is preferred for general-purpose platforms such as GPUs. This paper proposes a regularization-free structural pruning scheme to take advant...
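
The distinction the abstract draws between fine-grained and coarse-grained pruning can be illustrated with a minimal sketch. It is not the paper's regularization-free scheme, whose details are not given in the abstract. It assumes a plain magnitude criterion (smallest absolute value per weight, smallest L1 norm per row) purely for illustration, and the function names `unstructured_prune` and `structured_prune` are hypothetical.

```python
def unstructured_prune(weights, sparsity):
    """Fine-grained: zero the individual weights with the smallest magnitude.

    The matrix keeps its shape; zeros are scattered irregularly.
    Magnitude ranking is an assumed criterion, not the paper's method.
    """
    cells = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    k = int(len(cells) * sparsity)  # number of single weights to zero
    pruned = [row[:] for row in weights]
    for _, r, c in cells[:k]:
        pruned[r][c] = 0.0
    return pruned


def structured_prune(weights, sparsity):
    """Coarse-grained: drop whole rows (e.g. filters) with the smallest L1 norm.

    The result is a smaller dense matrix, which maps directly onto
    ordinary dense GPU kernels.
    """
    norms = sorted((sum(abs(w) for w in row), r) for r, row in enumerate(weights))
    k = int(len(weights) * sparsity)  # number of whole rows to remove
    dropped = {r for _, r in norms[:k]}
    return [row[:] for r, row in enumerate(weights) if r not in dropped]


# Toy 4x3 weight matrix (made-up values for illustration only).
W = [
    [0.9, -0.1, 0.4],
    [0.05, 0.02, -0.03],
    [-0.7, 0.6, 0.2],
    [0.1, -0.8, 0.3],
]

fine = unstructured_prune(W, 0.5)    # still 4x3, half the entries zeroed
coarse = structured_prune(W, 0.5)    # 2x3, fully dense
```

In this toy case both variants remove half of the weights, but only the coarse-grained result shrinks the matrix itself. The fine-grained result needs sparse storage or sparse kernels before it saves memory or time, which is the usual reason coarse-grained pruning is favored on GPUs.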