Abstract:RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. On the other hand, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for the training of DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.

HD-CIM: Hybrid-Device Computing-In-Memory Structure Based on MRAM and SRAM to Reduce Weight Loading Energy of Neural Networks

An 8-Bit in Resistive Memory Computing Core with Regulated Passive Neuron and Bitline Weight Mapping

A Low-Power In-Memory Multiplication and Accumulation Array with Modified Radix-4 Input and Canonical Signed Digit Weights

HiT-CIM: A High-Throughput Compute-In-Memory SRAM Architecture with Simultaneous Weight Loading/Computing and Balance Capabilities

CREAM: Computing in ReRAM-Assisted Energy- and Area-Efficient SRAM for Reliable Neural Network Acceleration.

A 28nm 8928Kb/mm 2 -Weight-Density Hybrid SRAM/ROM Compute-in-Memory Architecture Reducing >95% Weight Loading from DRAM.

A 28-Nm 36 Kb SRAM CIM Engine with 0.173 $\mu $m$^{2}$ 4T1T Cell and Self-Load-0 Weight Update for AI Inference and Training Applications

A Digital SRAM Computing-in-Memory Design Utilizing Activation Unstructured Sparsity for High-Efficient DNN Inference

TL-nvSRAM-CIM: Ultra-High-Density Three-Level ReRAM-Assisted Computing-in-nvSRAM with DC-Power Free Restore and Ternary MAC Operations

DS-CIM: A 40nm Asynchronous Dual-Spike Driven, MRAM Compute-In-Memory Macro for Spiking Neural Network

RRAM-DNN: an RRAM and Model-Compression Empowered All-Weights-On-Chip DNN Accelerator

A Multiply-Less Approximate SRAM Compute-In-Memory Macro for Neural-Network Inference

Energy-Efficient Hybrid-RAM with Hybrid Bit-Serial Based VMM Support

Cambricon-M: A Fibonacci-Coded Charge-Domain SRAM-Based CIM Accelerator for DNN Inference

A compute-in-memory chip based on resistive random-access memory

Novel Low-Power Computing-In-Memory (CIM) Design for Binary and Ternary Deep Neural Networks by Using 8T XNOR SRAM

An Energy-Efficient Floating-Point Compute SRAM with Pipelined In-Memory Bit-Parallel Exponent and Bitwise Mantissa Processing

Sparsity-Oriented MRAM-Centric Computing for Efficient Neural Network Inference

Hybrid RRAM/SRAM in-Memory Computing for Robust DNN Acceleration

Energy-efficient SNN Architecture using 3nm FinFET Multiport SRAM-based CIM with Online Learning

An Energy Efficient Computing-in-Memory Accelerator With 1T2R Cell and Fully Analog Processing for Edge AI Applications