Abstract: Compute-in-memory (CIM) based on emerging nonvolatile memory (eNVM) is an effective way to deploy neural networks on low-power edge devices, since it serves both storage and computation. NVMs such as ReRAM have been widely used in CIM. MRAM, meanwhile, supports more read and write cycles, shows lower device-to-device and cycle-to-cycle variation, and has a lower bit error rate, making it equally attractive for storage. However, its high read current and low on/off ratio lead to high read energy, which limits its large-scale application in CIM. A spiking neural network (SNN) represents information as sparse spike sequences, and hardware can exploit this spatio-temporal sparsity to achieve low-power computing. To further increase the input sparsity of the SNN and reduce read energy, this paper proposes the ADC-free dual-spike CIM (DS-CIM) macro, a spiking MRAM CIM macro driven by asynchronous dual spikes. Compared with conventional rate coding, our dual-spike coding method uses only 2 spikes to encode the information without losing accuracy. Moreover, the event-driven operation gives the macro sub-nW static power consumption. Our DS-CIM macro achieves comparable or higher accuracy while maintaining very low energy consumption. Specifically, it achieves accuracies of 96.99%, 82.87%, 90.00%, and 85.97% on digit classification, image classification, gesture recognition, and action recognition tasks, with energy consumption of only 8.07 nJ, 71.26 nJ, 729.3 nJ, and 369.82 nJ, respectively. These results underline the significance of DS-CIM and suggest directions for low-power inference on edge devices.
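
The contrast between rate coding and a two-spike code can be illustrated with a small sketch. The abstract does not specify how the two spikes carry the value, so the interval-based scheme below (value stored in the time gap between the spikes) is purely an assumption for illustration, and the function names `rate_encode`, `dual_spike_encode`, and `dual_spike_decode` are hypothetical.

```python
# Illustrative sketch only: the interval-based dual-spike scheme is an
# ASSUMPTION, not the encoding described in the paper.

def rate_encode(value: int, n_steps: int) -> list[int]:
    """Rate coding: emit `value` spikes spread evenly over `n_steps` steps."""
    if not 0 <= value <= n_steps:
        raise ValueError("value must lie in [0, n_steps]")
    train = [0] * n_steps
    for k in range(value):
        # Positions are distinct because value <= n_steps.
        train[(k * n_steps) // value] = 1
    return train


def dual_spike_encode(value: int, t_first: int = 0) -> tuple[int, int]:
    """Assumed dual-spike coding: the gap between two spikes carries the value.

    The +1 offset keeps the two spikes distinct when value == 0.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return (t_first, t_first + value + 1)


def dual_spike_decode(spikes: tuple[int, int]) -> int:
    """Recover the value from the inter-spike interval."""
    t_first, t_second = spikes
    return t_second - t_first - 1


if __name__ == "__main__":
    for v in (0, 5, 15):
        n_rate = sum(rate_encode(v, 15))
        pair = dual_spike_encode(v)
        print(f"value={v}: rate coding uses {n_rate} spikes, "
              f"dual-spike uses 2 (times {pair})")
```

Under this assumed scheme the spike count stays at two regardless of the encoded value, whereas the rate-coded spike count grows with the value, which is the kind of input sparsity the abstract refers to.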

A 65 nm 73 Kb SRAM-Based Computing-In-Memory Macro with Dynamic-Sparsity Controlling

An 8-Bit in Resistive Memory Computing Core with Regulated Passive Neuron and Bitline Weight Mapping

A Low-Power In-Memory Multiplication and Accumulation Array with Modified Radix-4 Input and Canonical Signed Digit Weights

A Digital SRAM Computing-in-Memory Design Utilizing Activation Unstructured Sparsity for High-Efficient DNN Inference

A 28nm 32kb SRAM Computing-in-Memory Macro with Hierarchical Capacity Attenuator and Input Sparsity-Optimized ADC for 4b MAC Operation

DS-CIM: A 40nm Asynchronous Dual-Spike Driven, MRAM Compute-In-Memory Macro for Spiking Neural Network

A 128 Kb DAC-less 6T SRAM computing-in-memory macro with prioritized subranging ADC for AI edge applications

A Multiply-Less Approximate SRAM Compute-In-Memory Macro for Neural-Network Inference

A 28nm 8Kb Reconfigurable SRAM Computing-In-Memory Macro With Input-Sparsity Optimized DTC for Multi-mode MAC Operations

An eDRAM Based Computing-in-Memory Macro with Full-Valid-Storage and Channel-Wise-Parallelism for Depthwise Neural Network

14.3 A 65nm Computing-in-Memory-Based CNN Processor with 2.9-to-35.8 TOPS/W System Energy Efficiency Using Dynamic-Sparsity Performance-Scaling Architecture and Energy-Efficient Inter/Intra-Macro Data Reuse

A 1–8b Reconfigurable Digital SRAM Compute-in-Memory Macro for Processing Neural Networks

An Energy-Efficient Computing-in-Memory NN Processor with Set-Associate Blockwise Sparsity and Ping-Pong Weight Update

An XOR-10T SRAM computing-in-memory macro with current MAC operations and time-to-digital conversion for BNN edge processors

A 4-Kb 1-to-8-bit Configurable 6T SRAM-Based Computation-in-Memory Unit-Macro for CNN-Based AI Edge Processors

Sparsity-Aware Non-Volatile Computing-In-Memory Macro with Analog Switch Array and Low-Resolution Current-Mode ADC

An RRAM-Based Digital Computing-in-Memory Macro with Dynamic Voltage Sense Amplifier and Sparse-Aware Approximate Adder Tree

A 28-nm 36 Kb SRAM CIM Engine with 0.173 $\mu$m$^{2}$ 4T1T Cell and Self-Load-0 Weight Update for AI Inference and Training Applications

34.3 A 22nm 64kb Lightning-Like Hybrid Computing-in-Memory Macro with a Compressed Adder Tree and Analog-Storage Quantizers for Transformer and CNNs

7.8 A 22nm Delta-Sigma Computing-In-Memory (ΔΣCIM) SRAM Macro with Near-Zero-Mean Outputs and LSB-First ADCs Achieving 21.38TOPS/W for 8b-MAC Edge AI Processing