A 578-TOPS/W RRAM-Based Binary Convolutional Neural Network Macro for Tiny AI Edge Devices

Lixun Wang, Yuejun Zhang, Pengjun Wang, Jianguo Yang, Huihong Zhang, Gang Li, Qikang Li
DOI: https://doi.org/10.1109/tvlsi.2024.3469217
2024-01-01
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Abstract: The novel nonvolatile computing-in-memory (nvCIM) technology enables data to be stored and processed in situ, providing a feasible solution for the widespread deployment of machine learning algorithms in edge AI devices. However, current nvCIM approaches based on weighted current summation face challenges such as device nonidealities and substantial time, storage, and energy overheads when handling high-precision analog signals. To address these issues, we propose a resistive random access memory (RRAM)-based binary convolution macro for constructing a complete binary convolutional neural network (BCNN) hardware circuit, accelerating edge AI applications that use low-precision weights. This macro performs error compensation at the circuit level and provides stable rail-to-rail output, eliminating the need for ADCs or a processor to perform auxiliary computations. Experimental results demonstrate that the proposed BCNN full-hardware computing system achieves on-chip recognition accuracy of 90.7% (98.64%) on the CIFAR10 (MNIST) dataset, which represents a decrease of 0.98% (0.59%) compared to software recognition accuracy. In addition, this binary convolution macro achieves a maximum throughput of 320 GOPS and a peak energy efficiency of 578 TOPS/W at 136 MHz.
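For readers unfamiliar with binary convolution, the following is a minimal software sketch of the arithmetic a BCNN layer performs when both inputs and weights are restricted to +1/-1. It uses the standard XNOR-and-popcount formulation from the binary neural network literature; the abstract does not state how the RRAM macro realizes this operation in circuitry, so the sketch should not be read as the authors' method. All function names (`pack_bits`, `binary_dot`, `binary_conv2d`) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a software model of binary convolution.
# Assumption: inputs and weights are +1/-1, and the output is binarized
# with a sign function. This is NOT a model of the paper's RRAM circuit.

def pack_bits(signs):
    """Pack a +1/-1 vector into an int, encoding +1 as bit 1 and -1 as bit 0."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(x_bits, w_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so the sum is
    2 * (number of agreements) - n.
    """
    mask = (1 << n) - 1
    agree = ~(x_bits ^ w_bits) & mask
    return 2 * bin(agree).count("1") - n


def binary_conv2d(image, kernel):
    """Valid 2-D binary convolution (no padding, stride 1) with sign output."""
    k = len(kernel)
    w_bits = pack_bits([v for row in kernel for v in row])
    out = []
    for r in range(len(image) - k + 1):
        row_out = []
        for c in range(len(image[0]) - k + 1):
            patch = [image[r + i][c + j] for i in range(k) for j in range(k)]
            acc = binary_dot(pack_bits(patch), w_bits, k * k)
            # Binarize the accumulated sum so the output is again +1/-1.
            row_out.append(1 if acc >= 0 else -1)
        out.append(row_out)
    return out
```

Because every output is again a single binary value, a layer of this kind can feed the next layer directly, which is consistent with the abstract's point that no ADC is needed between stages.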