3A-ReRAM: Adaptive Activation Accumulation in ReRAM-Based CNN Accelerator
Zihan Zhang, Jianfei Jiang, Qin Wang, Zhigang Mao, Naifeng Jing
DOI: https://doi.org/10.1109/tcad.2023.3297968
IF: 2.9
2024-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: ReRAM-based computing is well suited to accelerating convolutional neural network (CNN) inference because of its high computing parallelism, but its rigid crossbar structure becomes less efficient when faced with the random data sparsity abundant in CNNs. In this study, we propose $3A$-ReRAM, a novel crossbar architecture that dynamically predicts the accumulated results to enable adaptive activation accumulation, so that both zero and small values in the feature map can be exploited in each matrix-vector multiplication (MVM) operation for speedup. To predict the results dynamically, we propose an efficient parallel predictor that finds larger adapted boxes for increased computing parallelism without hurting accuracy. For better scheduling between the dynamic predictions, we propose an efficient input window management scheme with lightweight hardware support. With dynamic prediction and calculation, the $3A$-ReRAM architecture naturally fits the ReRAM crossbar structure while enabling a totally different way to dynamically exploit the sparsity and small values in feature maps. It greatly improves performance by increasing computing parallelism and saves energy by requiring far fewer analog-to-digital conversions. The evaluation results show that the $3A$-ReRAM architecture can increase performance by up to $13.03\times$, $16.31\times$, $2.46\times$, and $2.58\times$ compared with the ReRAM-based CNN accelerators ISAAC and PUMA (sparsity-unaware) and SRE and FORMS (sparsity-aware), and that the total energy can be reduced by $8.93\times$, $10.07\times$, $2.97\times$, and $4.58\times$, respectively.
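The intuition behind exploiting zero activations can be illustrated with a toy model. The sketch below is not the paper's predictor or window manager. It assumes 1-bit input slices and an analog-digital converter that can only resolve a column sum when at most `max_ones` rows carry a nonzero input bit. The function names `fixed_conversions` and `adaptive_conversions` and all parameters are hypothetical, introduced only for illustration. A fixed scheme activates the same number of rows per conversion regardless of the data, whereas an adaptive scheme keeps widening the window of consecutive rows until the nonzero budget is used up, so sparse inputs need fewer conversions.

```python
from math import ceil

def fixed_conversions(bits, rows_per_conversion):
    """Conversions needed when a fixed number of rows is activated each time."""
    return ceil(len(bits) / rows_per_conversion)

def adaptive_conversions(bits, max_ones):
    """Conversions needed when each window of consecutive rows is greedily
    extended until it contains max_ones nonzero input bits (toy assumption)."""
    conversions = 0
    ones = 0   # nonzero bits in the current window
    rows = 0   # rows in the current window
    for b in bits:
        if b == 1 and ones == max_ones:
            # Budget exhausted: close the current window and start a new one.
            conversions += 1
            ones, rows = 0, 0
        ones += b
        rows += 1
    # Count the last, still-open window if it holds any rows.
    return conversions + (1 if rows else 0)

# Hypothetical 16-row input slice with 75% zeros.
sparse_bits = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
print(fixed_conversions(sparse_bits, 4))     # 4
print(adaptive_conversions(sparse_bits, 4))  # 1
```

With a fully dense input the two schemes need the same number of conversions, so in this toy model the gain comes entirely from sparsity. The abstract additionally mentions exploiting small values, which this sketch does not model.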