W-AMA: Weight-aware Approximate Multiplication Architecture for Neural Processing

Bo Liu, Renyuan Zhang, Qiao Shen, Zeju Li, Na Xie, Yuanhao Wang, Chonghang Xie, Hao Cai
DOI: https://doi.org/10.1016/j.compeleceng.2023.108921
IF: 4.152
2023-01-01
Computers & Electrical Engineering
Abstract: This paper presents the Weight-aware Approximate Multiplication Architecture (W-AMA) for Deep Neural Networks (DNNs). Exploiting the Gaussian-like distribution of weights, it deploys an accuracy-configurable computing component to improve computational efficiency. Two techniques for effectively integrating the W-AMA into a DNN accelerator are presented: (1) A Cartesian Genetic Programming (CGP) based approximate multiplier is designed and can be selected to compute the Least Significant Bit (LSB) in a higher-accuracy mode. A Reward–Penalty-Coefficient (RPC) is proposed to achieve internal compensation. (2) The Hessian-Aware-Approximation (HAA) method is used to map hybrid approximate modes across layers. Based on the W-AMA, an energy-efficient DNN accelerator is proposed and evaluated in 28 nm technology. It achieves an energy efficiency of 9.6 TOPS/W, and its computational energy efficiency is improved by 1.5× compared with standard units, with a 0.52% accuracy loss on CIFAR-10 using ResNet-18.
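To make the idea of an accuracy-configurable multiplier concrete, the following is a minimal Python sketch and not the W-AMA design itself. It assumes a hypothetical scheme in which the weight is split into a high-order part and a low-order part (`low_bits`, an assumed parameter), and a mode flag (`lsb_mode`, also assumed) selects whether the low-order partial product is computed. The low-order product is computed exactly here as a stand-in; the abstract describes a CGP-based approximate multiplier for that path, whose internals are not given.

```python
def approx_mul(a: int, w: int, lsb_mode: bool, low_bits: int = 4) -> int:
    """Multiply activation `a` by weight `w` in one of two accuracy modes.

    Hypothetical illustration: the split point and mode flag are assumptions,
    not taken from the paper.
    """
    # Low-order bits of the weight (non-negative, also for negative w,
    # because Python integers behave like sign-extended two's complement).
    w_low = w & ((1 << low_bits) - 1)
    # Remaining high-order part, always a multiple of 2**low_bits.
    w_high = w - w_low

    # The high-order partial product is always computed.
    product = a * w_high

    if lsb_mode:
        # Higher-accuracy mode: add the low-order partial product.
        # Exact stand-in for the approximate LSB multiplier.
        product += a * w_low
    return product


if __name__ == "__main__":
    a, w = 7, 37
    print("exact          :", a * w)
    print("lsb_mode=True  :", approx_mul(a, w, lsb_mode=True))
    print("lsb_mode=False :", approx_mul(a, w, lsb_mode=False))
```

In this sketch the lower-accuracy mode simply drops the low-order contribution, so the error is bounded by `a * (2**low_bits - 1)`. How the actual architecture compensates for that error (for example through the RPC) is not specified in the abstract.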
What problem does this paper attempt to address?