Sparsity-Aware Clamping Readout Scheme for High Parallelism and Low Power Nonvolatile Computing-in-Memory Based on Resistive Memory

Linfang Wang, Wang Ye, Junjie An, Chunmeng Dou, Qi Liu, Meng-Fan Chang, Ming Liu
DOI: https://doi.org/10.1109/iscas51556.2021.9401670
2021-01-01
Abstract: The input parallelism of resistive memory (RRAM) based nonvolatile computing-in-memory (nvCIM) structures is limited by the signal margin and the readout precision. In this work, we propose a sparsity-aware clamping (SAC) scheme and its circuit implementation for nvCIM through circuit-algorithm co-design. The scheme adaptively tunes the quantization range and resolution of the readout circuit according to the degree of sparsity in neural network models. As a result, the SAC scheme can effectively increase the input parallelism of nvCIMs without degrading the signal margin or increasing the hardware cost of analogue readout. A case study on processing a multi-layer perceptron (MLP) model with the proposed nvCIM structure shows that the SAC scheme can improve the throughput by 2 times and increase the energy efficiency by 25.35% with negligible loss of inference accuracy.
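The idea of matching the readout range to sparsity can be illustrated with a small numerical sketch. This is not the paper's circuit or algorithm: the binary input/weight model, the function names `readout` and `mean_abs_error`, and every parameter value (64 rows, 25% density, 16 readout levels, clamp at 15) are assumptions chosen only for illustration. The sketch compares a fixed-resolution readout that spans the full accumulation range with one whose range is clamped to the values a sparse workload actually produces.

```python
import random


def readout(mac_value, clamp_max, levels):
    """Quantize an ideal accumulation value with a readout that saturates
    at clamp_max and has `levels` evenly spaced output codes."""
    clamped = min(mac_value, clamp_max)   # values above the clamp saturate
    step = clamp_max / (levels - 1)       # smaller range gives a finer step
    return round(clamped / step) * step


def mean_abs_error(n_rows, density, clamp_max, levels, trials=2000, seed=0):
    """Average readout error over random sparse binary inputs and weights.

    Each of the n_rows contributes 1 only when both its input bit and its
    weight bit are 1 (assumed independent, each with probability `density`).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        mac = sum(
            1
            for _ in range(n_rows)
            if rng.random() < density and rng.random() < density
        )
        total += abs(mac - readout(mac, clamp_max, levels))
    return total / trials


if __name__ == "__main__":
    n_rows, density, levels = 64, 0.25, 16  # hypothetical settings
    full = mean_abs_error(n_rows, density, clamp_max=n_rows, levels=levels)
    clamped = mean_abs_error(n_rows, density, clamp_max=15, levels=levels)
    print(f"full-range readout error: {full:.3f}")
    print(f"clamped readout error:    {clamped:.3f}")
```

Under these assumed settings the accumulated value rarely approaches the number of activated rows, so spending the same number of readout levels on a clamped range gives a finer step where the values actually fall. How the clamp threshold is chosen and realized in hardware is specific to the paper and is not modeled here.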