A 28nm 32kb SRAM Computing-in-Memory Macro with Hierarchical Capacity Attenuator and Input Sparsity-Optimized ADC for 4b Mac Operation

Kanglin Xiao, Xiaoxin Cui, Xin Qiao, Jiahao Song, Haoyang Luo, Xin'an Wang, Yuan Wang
DOI: https://doi.org/10.1109/tcsii.2023.3234620
2023-01-01
Abstract: Computing-in-memory (CIM) is an emerging approach for alleviating the von Neumann bottleneck of latency and energy overheads and for improving energy efficiency and throughput. In this brief, we present a novel CIM macro aimed at improving the energy efficiency and throughput of edge devices when running 4b multiply-and-accumulate (MAC) operations. The proposed architecture uses (1) a customized 9T1C bit-cell with charge-domain computation for improved sensing margin and a compact design; (2) a hierarchical capacity attenuator for 4b weight accumulation that avoids complicated control switches and signals, improving throughput; and (3) an input sparsity-sensing-based flash analog-to-digital converter (ADC) readout scheme to improve energy efficiency and throughput. Fabricated in 28nm CMOS technology, the proposed 32Kb SRAM CIM macro demonstrates an average energy efficiency of 646.6 TOPS/W (normalized to 4b/4b input/weight) and a throughput of 1638.4 GOPS while achieving 84.89% classification accuracy on the CIFAR-10 dataset at 4b precision in inputs and weights.
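The 4b MAC arithmetic described in the abstract can be illustrated in software. The following is a minimal sketch, not the authors' circuit: it assumes unsigned 4b inputs and weights, models each weight bit as a separate partial sum, and combines the four partial sums with binary weights (8:4:2:1), which is one plausible reading of what a hierarchical attenuator achieves in the analog domain. All function names are hypothetical, and the sparsity bound at the end is only an illustration of why knowing the number of nonzero inputs could narrow the range an ADC has to resolve.

```python
def mac_4b_reference(inputs, weights):
    """Plain dot product of 4b values, used as the golden result."""
    return sum(x * w for x, w in zip(inputs, weights))


def mac_4b_bit_planes(inputs, weights):
    """Same dot product, computed one weight bit-plane at a time.

    Assumption: inputs and weights are unsigned integers in [0, 15].
    """
    for value in list(inputs) + list(weights):
        if not 0 <= value <= 15:
            raise ValueError("expected unsigned 4b values in [0, 15]")

    # One partial sum per weight bit (bit 0 is the LSB).
    partial_sums = [
        sum(x * ((w >> bit) & 1) for x, w in zip(inputs, weights))
        for bit in range(4)
    ]

    # Binary-weighted combination: bit b contributes with weight 2**b.
    return sum(p << bit for bit, p in enumerate(partial_sums))


def partial_sum_upper_bound(inputs):
    """Upper bound on any single bit-plane partial sum.

    Assumption: only the count of nonzero inputs is known, so each
    nonzero input is bounded by the 4b maximum of 15.
    """
    nonzero = sum(1 for x in inputs if x != 0)
    return 15 * nonzero


if __name__ == "__main__":
    xs = [3, 0, 15, 7]
    ws = [5, 9, 0, 15]
    print(mac_4b_reference(xs, ws))
    print(mac_4b_bit_planes(xs, ws))
    print(partial_sum_upper_bound(xs))
```

Both MAC functions return the same value for valid inputs; the bit-plane version only makes the per-bit weighting explicit. A sparser input vector gives a smaller bound from `partial_sum_upper_bound`, which is the intuition behind a sparsity-aware readout, although the abstract does not specify how the macro uses this information.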