131TOPS/W 8b ACIM Exploiting Weight-Embedded Auto-Accumulation and Supporting Symmetric Quantization Networks

Wei He, Puyi Bai, Hongyang Luo, Zhenghao Jin, Han Wu, Junyi Zhang, Xingchen Chao, Haiqi Liu, Yajuan He, Qiang Li
DOI: https://doi.org/10.1109/cicc60959.2024.10529000
2024-01-01
Abstract: Fixed-point neural networks are widely used for inference applications, where symmetric-quantized weights are required to avoid additional input-data-dependent terms [1], as shown in Fig. 1. However, most prior works do not support symmetric-quantized weights. State-of-the-art digital CIM (DCIM) [3], mixed-signal CIM [4]–[6], and analog CIM (ACIM) [7]–[9] designs can often accumulate only positive numbers. With such asymmetric-quantized weights, an additional term has to be computed for each batch of data during inference, resulting in significant overhead in both latency and power, as it is equivalent to adding an extra channel. Some DCIM designs support symmetric-quantized weights using additional circuits [2], which consume additional power and area.
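The input-data-dependent term mentioned in the abstract can be illustrated with a small integer dot product. The sketch below is not taken from the paper. It assumes the common affine quantization convention in which a stored weight code equals the signed weight plus a zero-point `z_w`, and all function names and numeric values are hypothetical. When `z_w` is nonzero (asymmetric weights, positive-only codes), the result needs a correction of `z_w * sum(q_x)`, which changes with every input vector. When `z_w` is zero (symmetric weights), the correction disappears.

```python
# Minimal sketch (assumed convention, not the paper's circuit):
# stored weight code = signed weight + z_w, inputs q_x are unsigned codes.

def dot_symmetric(q_w, q_x):
    """Dot product with signed (symmetric) weight codes; no correction term."""
    return sum(w * x for w, x in zip(q_w, q_x))


def dot_asymmetric(q_w_codes, q_x, z_w):
    """Dot product with positive-only weight codes offset by zero-point z_w.

    Returns (corrected_result, correction). The correction depends on the
    input data, so it must be recomputed for every new input vector.
    """
    mac = sum(w * x for w, x in zip(q_w_codes, q_x))  # positive-only accumulation
    correction = z_w * sum(q_x)  # acts like an extra all-ones weight channel
    return mac - correction, correction


# Hypothetical example values.
q_x = [3, 0, 7, 2]                        # unsigned input codes
q_w_signed = [-5, 4, 0, -2]               # symmetric weight codes
z_w = 128                                 # assumed zero-point for 8-bit codes
q_w_codes = [w + z_w for w in q_w_signed]  # asymmetric, positive-only codes

sym_result = dot_symmetric(q_w_signed, q_x)
asym_result, correction = dot_asymmetric(q_w_codes, q_x, z_w)
print(sym_result, asym_result, correction)  # -19 -19 1536
```

Both paths give the same result, but the asymmetric path has to sum the inputs and scale by `z_w` on every inference. This is consistent with the abstract's description of the overhead as equivalent to adding an extra channel.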