Abstract: Computing-in-memory (CIM) relieves the von Neumann bottleneck by storing the weights of neural networks in memory arrays. However, two challenges remain that hinder the efficient acceleration of convolutional neural networks (CNNs) in artificial intelligence (AI) edge devices. First, the activations for sliding-window (SW) operations in CNNs still impose high memory access pressure. This can be alleviated by increasing the SW parallelism, but simple array replication suffers from poor array utilization and large peripheral-circuit overhead. Second, the partial sums from individual CIM arrays, which are usually accumulated to obtain the final sum, introduce large latency because of the many shift-and-add operations required. Moreover, high-resolution ADCs are needed to reduce the quantization error of the partial sums, further increasing the hardware cost. In this paper, a hardware-efficient CIM accelerator, ARBiS, is proposed with improved activation reusability and bit-scalable matrix-vector multiplication (MVM) for CNN acceleration in AI edge applications. Cyclic-shift weight duplication exploits a third dimension, the receptive field (RF) depth, for SW weight mapping to reduce the memory accesses of activations while improving the array utilization. Parasitic-capacitance charge sharing is employed to realize high-precision analog MVM and thereby reduce the ADC cost. Compared with conventional architectures, ARBiS with parallel processing of 9 SW operations alleviates memory access pressure by 56.6%~58.8%. Meanwhile, ARBiS configured with 8-bit ADCs saves 92.53%~94.53% of the ADC energy consumption. An ARBiS accelerator is evaluated to achieve a computational efficiency (CE) of 10.28 (10.43) TOPS/mm² and an energy efficiency (EE) of 91.19 (112.36) TOPS/W with 8-bit (4-bit) ADCs, corresponding to improvements of $11.4\sim 11.7\times$ ($11.6\sim 11.8\times$) in CE and $1.1\sim 3.3\times$ ($1.4\sim 4\times$) in EE over state-of-the-art works.
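To make the second challenge concrete, the following is a minimal sketch of the conventional digital shift-and-add accumulation that the abstract identifies as a latency source. It is not the ARBiS method, which performs the accumulation in the analog domain through charge sharing. The function names (`bit_planes`, `bit_sliced_mvm`, `reference_mvm`) and the 4-bit unsigned operands are assumptions made for illustration, and ADC quantization is ignored.

```python
def bit_planes(values, bits):
    """Split unsigned integers into bit-planes, least significant bit first."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def bit_sliced_mvm(weights, activations, w_bits=4, a_bits=4):
    """Compute weights @ activations from 1-bit partial sums.

    weights: list of rows of unsigned ints below 2**w_bits
    activations: list of unsigned ints below 2**a_bits
    Returns (outputs, number_of_shift_add_steps).
    """
    a_planes = bit_planes(activations, a_bits)
    outputs = []
    steps = 0
    for row in weights:
        w_planes = bit_planes(row, w_bits)
        acc = 0
        for i, w_plane in enumerate(w_planes):
            for j, a_plane in enumerate(a_planes):
                # 1-bit dot product: stands in for one array readout.
                partial = sum(w & a for w, a in zip(w_plane, a_plane))
                # Weight the partial sum by its bit significance and accumulate.
                acc += partial << (i + j)
                steps += 1
        outputs.append(acc)
    return outputs, steps


def reference_mvm(weights, activations):
    """Plain integer matrix-vector product used as a check."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


W = [[3, 15, 0, 7], [8, 1, 12, 5]]
x = [9, 2, 14, 6]
y, steps = bit_sliced_mvm(W, x)
```

In this sketch each output needs `w_bits * a_bits` shift-and-add steps, so the step count grows with the product of the two precisions; this growth is what motivates moving the accumulation out of the digital domain.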

An Energy-Efficient Mixed-Bit CNN Accelerator With Column Parallel Readout for ReRAM-Based In-Memory Computing

A 1T2R1C ReRAM CIM Accelerator with Energy-Efficient Voltage Division and Capacitive Coupling for CNN Acceleration in AI Edge Applications

An Energy Efficient Computing-in-Memory Accelerator With 1T2R Cell and Fully Analog Processing for Edge AI Applications

AEPE: an Area and Power Efficient RRAM Crossbar-Based Accelerator for Deep CNNs

RRAM Based Buffer Design for Energy Efficient CNN Accelerator

A Low-Power Charge-Domain Bit-Scalable Readout System for Fully-Parallel Computing-in-Memory Accelerators

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM

ReHy: A ReRAM-based Digital/Analog Hybrid PIM Architecture for Accelerating CNN Training

Mixed Size Crossbar Based RRAM CNN Accelerator with Overlapped Mapping Method

A 28 nm 81 Kb 5995.3 TOPS/W 4T2R ReRAM Computing-in-Memory Accelerator With Voltage-to-Time-to-Digital Based Output

ReDy: A Novel ReRAM-centric Dynamic Quantization Approach for Energy-efficient CNN Inference

Design Framework for SRAM-Based Computing-In-Memory Edge CNN Accelerators

Low Bit-Width Convolutional Neural Network on RRAM

An Energy-Efficient Quantized and Regularized Training Framework for Processing-In-Memory Accelerators

XB-SIM∗: A Simulation Framework for Modeling and Exploration of ReRAM-based CNN Acceleration Design

NAND-SPIN-based processing-in-MRAM architecture for convolutional neural network acceleration

An Energy-Efficient Floating-Point Compute SRAM with Pipelined In-Memory Bit-Parallel Exponent and Bitwise Mantissa Processing

APIM: An Antiferromagnetic MRAM-Based Processing-In-Memory System for Efficient Bit-level Operations of Quantized Convolutional Neural Networks

A ReRAM-Based Computing-in-Memory Convolutional-Macro with Customized 2T2R Bit-Cell for AIoT Chip IP Applications

ARBiS: A Hardware-Efficient SRAM CIM CNN Accelerator with Cyclic-Shift Weight Duplication and Parasitic-Capacitance Charge Sharing for AI Edge Application

Efficient Implementation of Multi-Channel Convolution in Monolithic 3D ReRAM Crossbar