Live Demonstration for Input-Sparsity-Aware RRAM Processing-in-Memory Chip

Junjie Wang, Shuang Liu, Ruicheng Pan, Shiqin Yan, Yihe Liu, Yang Liu
DOI: https://doi.org/10.1109/iscas58744.2024.10558412
2024-01-01
Abstract: This paper presents a live demonstration of an RRAM processing-in-memory (PIM) chip that exploits input sparsity to reduce power consumption and increase throughput. Offline quantization-aware training (QAT) is used to fine-tune models so that they suit the 4-bit PIM chip. After QAT, the model achieves an accuracy of 90.08% on the test dataset. Notably, the sparsity of the input activations is consistently above 90%. This high sparsity contributes substantially to both the throughput and the energy efficiency of the PIM chip. The design achieves a throughput of 410 Gops, which is 9 times higher than that of a design without input-sparsity awareness.
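The abstract does not say how the chip skips zero inputs, so the following is only a minimal software sketch of the general idea, not the authors' circuit. It assumes unsigned 4-bit activations and a weight array with one row per input; the function names `quantize_4bit` and `sparse_mvm` are hypothetical and chosen for illustration. Rows whose input is zero contribute nothing to the result, so they can be skipped without changing the output.

```python
def quantize_4bit(x, scale):
    """Map an activation to an unsigned 4-bit integer in 0..15 (assumed format)."""
    q = round(x / scale)
    return max(0, min(15, q))


def sparse_mvm(weights, inputs):
    """Matrix-vector multiply that skips rows whose input is zero.

    weights: list of rows, one row per input, each row a list of column weights.
    inputs:  list of integer activations, one per row.
    Returns (outputs, active_rows), where active_rows counts the rows
    that were actually processed.
    """
    n_cols = len(weights[0])
    outputs = [0] * n_cols
    active_rows = 0
    for x, row in zip(inputs, weights):
        if x == 0:
            continue  # a zero input adds nothing to any column sum
        active_rows += 1
        for j, w in enumerate(row):
            outputs[j] += x * w
    return outputs, active_rows


def dense_mvm(weights, inputs):
    """Reference multiply that processes every row, zero or not."""
    n_cols = len(weights[0])
    return [sum(x * row[j] for x, row in zip(inputs, weights))
            for j in range(n_cols)]


# Illustrative data only: 10 inputs, 9 of them zero (90% sparsity).
demo_inputs = [0, 0, 0, 7, 0, 0, 0, 0, 0, 0]
demo_weights = [[1, 1], [3, 0], [0, 2], [2, -1], [5, 5],
                [1, 0], [0, 1], [4, 4], [2, 2], [1, 3]]

demo_outputs, demo_active = sparse_mvm(demo_weights, demo_inputs)
print(demo_outputs, demo_active)
```

In this toy case only 1 of 10 rows is processed, yet the result matches the dense reference. How such skipped work translates into throughput and energy on the real chip depends on hardware details that the abstract does not give.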