Conv-inheritance: A hardware-efficient method to compress convolutional neural networks for edge applications

Yang Yang, Chao Wang, Lei Gong, Min Wu, Xuehai Zhou
DOI: https://doi.org/10.1016/j.neucom.2021.02.106
IF: 6
2022-05-01
Neurocomputing
Abstract: Convolutional Neural Networks (CNNs) have achieved tremendous success in various applications, such as image recognition and natural language processing. Because CNN models are compute-intensive and memory-intensive, it is challenging to deploy them on devices with limited resources and tight power budgets. To address this limitation, pruning, quantization, weight sharing, and other methods have been proposed to compress CNN models. However, these works usually do not optimize the hardware design when compressing the CNNs. In this paper, we introduce Conv-Inheritance, a hardware-efficient compression method for CNNs that reduces the number of convolution operations, thereby reducing both the inference time and the on-chip hardware computing resources of CNN edge devices. We also develop an extended version, called adaptive Conv-Inheritance, which further improves inference performance by considering the similarity between pixels. We applied Conv-Inheritance to compress various CNN models and performed extensive experiments on different image datasets. Experimental results demonstrate that Conv-Inheritance is both hardware-efficient and time-efficient on edge devices. In addition, Conv-Inheritance is a plug-and-play compression method for CNNs that can run on top of other compression methods without conflict.
computer science, artificial intelligence