A 40nm 1Mb 35.6 TOPS/W MLC NOR-Flash Based Computation-in-Memory Structure for Machine Learning

Yuxin Zhang, Sitao Zeng, Zhiguo Zhu, Zhaolong Qin, Chen Wang, Jingjing Li, Sanfeng Zhang, Yajuan He, Chunmeng Dou, Xin Si, Meng-Fan Chang, Qiang Li
DOI: https://doi.org/10.1109/iscas51556.2021.9401600
2021-01-01
Abstract: Computation-in-memory (CIM) is a feasible way to overcome the "von Neumann bottleneck" while delivering high throughput and energy efficiency. In this paper, we propose a 1Mb multi-level cell (MLC) NOR flash based CIM (MLFlash-CIM) structure in a 40nm technology node. A multi-bit readout circuit is proposed to realize adaptive quantization; it comprises a current interface circuit, a multi-level analog shift amplifier (AS-Amp), and an 8-bit SAR-ADC. When applied to a modified VGG-16 network with 16 layers, the proposed MLFlash-CIM achieves 92.73% inference accuracy on the CIFAR-10 dataset. The CIM structure also achieves a peak throughput of 3.277 TOPS and an energy efficiency of 35.6 TOPS/W with 4-bit multiply-and-accumulate (MAC) operations.
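To put the two headline figures in more familiar units, the short Python sketch below converts the quoted energy efficiency into energy per operation and estimates the power implied by the quoted peak throughput. The function names are illustrative only, and the power estimate assumes that the peak throughput and the peak efficiency are reached at the same operating point, which the abstract does not state.

```python
# Figures quoted in the abstract.
PEAK_THROUGHPUT_TOPS = 3.277          # tera-operations per second
ENERGY_EFFICIENCY_TOPS_PER_W = 35.6   # tera-operations per second per watt


def energy_per_op_fj(tops_per_watt: float) -> float:
    """Energy per operation in femtojoules.

    TOPS/W is equivalent to tera-operations per joule, so the energy
    per operation is the reciprocal.
    """
    ops_per_joule = tops_per_watt * 1e12
    return 1e15 / ops_per_joule


def implied_power_mw(tops: float, tops_per_watt: float) -> float:
    """Power in milliwatts implied by a throughput and an efficiency.

    Assumption (not from the abstract): both figures describe the same
    operating point.
    """
    return tops / tops_per_watt * 1e3


if __name__ == "__main__":
    print(f"{energy_per_op_fj(ENERGY_EFFICIENCY_TOPS_PER_W):.2f} fJ/op")
    print(f"{implied_power_mw(PEAK_THROUGHPUT_TOPS, ENERGY_EFFICIENCY_TOPS_PER_W):.2f} mW")
```

Under that assumption, the figures work out to roughly 28 fJ per operation and about 92 mW. These are derived estimates, not values reported in the abstract.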