A Precision-Scalable Deep Neural Network Accelerator with Activation Sparsity Exploitation

Wenjie Li, Aokun Hu, Ningyi Xu, Guanghui He
DOI: https://doi.org/10.1109/tcad.2023.3310916
IF: 2.9
2024-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: To meet the demands of a wide range of practical applications, precision-scalable deep neural network (DNN) accelerators are becoming an unavoidable trend. At the same time, it has been demonstrated that a DNN accelerator can achieve better computation efficiency by exploiting sparsity. DNN accelerators that combine precision scalability with sparsity exploitation are therefore expected to perform better. In this article, we propose an efficient precision-scalable DNN accelerator that exploits the sparsity of activations. Precision scalability is obtained from a decomposable multiplier inspired by the well-known Bit Fusion design, and a zero-skipping scheme is adopted to leverage the inherent sparsity of activations. We first modify the architecture of the conventional fusion unit (FU) to make it amenable to the zero-skipping scheme. Then, a segmentation approach is devised to tackle memory access conflicts. Furthermore, a sparsity-aware mapping method is proposed to balance the workload of the processing elements (PEs). Finally, we present a bit-splitting strategy that takes advantage of sparsity at the bit level. Compared with state-of-the-art precision-scalable designs, the proposed accelerator provides speedups of $4.12\times$, $4.07\times$, and $6.62\times$ in the precision modes $8b\times 8b$, $4b\times 4b$, and $2b\times 2b$, respectively. It also achieves $3.92\times$ peak area efficiency and competitive peak energy efficiency.
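The two ideas the abstract combines, a multiplier decomposed into low-precision partial products and skipping of zero activations, can be illustrated with a small software model. The following is only a minimal sketch under stated assumptions, not the accelerator's actual datapath: operands are unsigned, the basic building block is assumed to be a 2b x 2b product, and the names `split_2b`, `fused_multiply`, and `sparse_dot` are hypothetical. It counts how many 2b x 2b products are actually computed when zero activations and zero 2-bit activation slices are skipped.

```python
def split_2b(x, bits):
    """Split an unsigned `bits`-wide integer into 2-bit slices, least significant first."""
    return [(x >> shift) & 0b11 for shift in range(0, bits, 2)]


def fused_multiply(a, w, bits=8):
    """Multiply two unsigned values by shift-adding 2b x 2b partial products.

    Returns (product, count of 2b x 2b products computed).
    Zero activation slices are skipped (an assumed stand-in for bit-level sparsity).
    """
    product, used = 0, 0
    for i, a_slice in enumerate(split_2b(a, bits)):
        if a_slice == 0:
            continue  # a zero slice contributes nothing to the product
        for j, w_slice in enumerate(split_2b(w, bits)):
            # Each partial product is weighted by the positions of both slices.
            product += (a_slice * w_slice) << (2 * (i + j))
            used += 1
    return product, used


def sparse_dot(activations, weights, bits=8):
    """Dot product that skips zero activations entirely (value-level zero-skipping)."""
    total, used = 0, 0
    for a, w in zip(activations, weights):
        if a == 0:
            continue  # zero activation: no multiplier work is issued
        p, n = fused_multiply(a, w, bits)
        total += p
        used += n
    return total, used


if __name__ == "__main__":
    acts = [0, 3, 0, 200]
    wts = [17, 5, 99, 2]
    result, used = sparse_dot(acts, wts)
    dense = len(acts) * (8 // 2) ** 2  # 16 partial products per 8b x 8b multiply
    print(result, used, dense)
```

In this toy input, two of the four activations are zero and the remaining two contain zero 2-bit slices, so far fewer partial products are computed than in the dense case. How the real design schedules the skipped work across PEs is what the segmentation and sparsity-aware mapping steps in the abstract address; that is not modeled here.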