Abstract: Brain-inspired spiking neural networks (SNNs) are considered energy-efficient alternatives to conventional deep neural networks (DNNs). By adopting event-driven information processing, SNNs can significantly reduce the computational demands associated with DNNs while still achieving comparable performance. However, current SNNs primarily prioritize high accuracy by constructing complex neuron models that generate sparse spikes. Unfortunately, this approach results in low energy efficiency and high latency, posing a significant challenge for deploying SNNs at the edge. Furthermore, the dominant computation in SNNs, which involves spike-wise Accumulate-Compare operations, is well-suited for Computing-in-Memory (CIM) architectures. However, exploiting high parallel processing and spike sparsity in CIM-based SNN accelerators is challenging due to the irregularity and time dependency of spikes. To address these limitations, this paper proposes COMPASS, an SRAM-based CIM architecture for efficient SNNs. We first introduce an efficient method to exploit irregular sparsity for both input spikes (explicit) and output spikes (implicit). This is achieved through a speculation mechanism that exploits dynamic spike patterns, enabling lean hardware for sparsity utilization. Additionally, the CIM architecture is carefully modified to facilitate dynamic spike pattern generation and exploitation with minimal overhead. Moreover, we design an adaptive dataflow with temporal spike representation tailored for input/output spikes, reducing memory footprint and enabling parallel execution. Comprehensive evaluation results demonstrate that COMPASS can achieve a 26.7x end-to-end speedup over recent SNN accelerator hardware implementations, with up to 386.7x less energy per inference.
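
For readers unfamiliar with the spike-wise Accumulate-Compare operation mentioned in the abstract, the following is a minimal software sketch of one timestep of a layer of integrate-and-fire neurons. It is not the COMPASS hardware or its speculation mechanism; the function name `accumulate_compare`, the integer weights, and the subtract-threshold reset are all assumptions chosen for illustration. It shows only why input spike sparsity reduces work: weight rows whose input did not spike are never read.

```python
def accumulate_compare(weights, spikes, v_mem, threshold):
    """One timestep of an integrate-and-fire layer (illustrative sketch).

    weights[i][j] -- weight from input i to output neuron j
    spikes        -- list of 0/1 input spikes for this timestep
    v_mem         -- membrane potentials, updated in place
    threshold     -- firing threshold shared by all neurons (assumption)
    Returns the list of 0/1 output spikes.
    """
    # Accumulate: event-driven, so rows with no input spike are skipped.
    for i, spike in enumerate(spikes):
        if not spike:
            continue
        row = weights[i]
        for j in range(len(v_mem)):
            v_mem[j] += row[j]

    # Compare: a neuron fires when its potential reaches the threshold.
    out = []
    for j in range(len(v_mem)):
        if v_mem[j] >= threshold:
            out.append(1)
            v_mem[j] -= threshold  # reset by subtraction (assumption)
        else:
            out.append(0)
    return out


weights = [[1, 2], [3, 4], [5, 6]]
v_mem = [0, 0]
out_spikes = accumulate_compare(weights, [1, 0, 1], v_mem, threshold=7)
```

In this example only the first and third weight rows are accumulated, because the second input did not spike.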

Energy-efficient SNN Architecture using 3nm FinFET Multiport SRAM-based CIM with Online Learning

An 8-Bit in Resistive Memory Computing Core with Regulated Passive Neuron and Bitline Weight Mapping

Tempo-CIM: A RRAM Compute-in-Memory Neuromorphic Accelerator with Area-Efficient LIF Neuron and Split-Train-Merged-Inference Algorithm for Edge AI Applications

Novel Low-Power Computing-In-Memory (CIM) Design for Binary and Ternary Deep Neural Networks by Using 8T XNOR SRAM

DS-CIM: A 40nm Asynchronous Dual-Spike Driven, MRAM Compute-In-Memory Macro for Spiking Neural Network

A Digital SRAM Computing-in-Memory Design Utilizing Activation Unstructured Sparsity for High-Efficient DNN Inference

A 28-Nm 135.19 TOPS/W Bootstrapped-SRAM Compute-in-Memory Accelerator with Layer-Wise Precision and Sparsity

A Heterogeneous Microprocessor for Intermittent AI Inference Using Nonvolatile-SRAM-based Compute-In-Memory

Memristor-based Deep Spiking Neural Network with a Computing-In-Memory Architecture

RRAM-based Analog-Weight Spiking Neural Network Accelerator with In-Situ Learning for IoT Applications

NeuroNAS: A Framework for Energy-Efficient Neuromorphic Compute-in-Memory Systems using Hardware-Aware Spiking Neural Architecture Search

A 28-Nm 36 Kb SRAM CIM Engine with 0.173 $\mu$m$^{2}$ 4T1T Cell and Self-Load-0 Weight Update for AI Inference and Training Applications

An Energy Efficient Computing-in-Memory Accelerator With 1T2R Cell and Fully Analog Processing for Edge AI Applications

An 1.38nJ/inference Clock-Free Mixed-Signal Neuromorphic Architecture Using ReL-PSP Function and Computing-in-Memory

Cambricon-M: A Fibonacci-Coded Charge-Domain SRAM-Based CIM Accelerator for DNN Inference

RISC-V based Fully-Parallel SRAM Computing-in-Memory Accelerator with High Hardware Utilization and Data Reuse Rate

A 65 Nm 73 Kb SRAM-Based Computing-In-Memory Macro with Dynamic-Sparsity Controlling

A Novel CNFET SRAM-Based Compute-In-Memory for BNN Considering Chirality and Nanotubes

COMPASS: SRAM-Based Computing-in-Memory SNN Accelerator with Adaptive Spike Speculation

An Energy-Efficient Computing-in-Memory NN Processor with Set-Associate Blockwise Sparsity and Ping-Pong Weight Update

A Dual-Split 6T SRAM-Based Computing-in-Memory Unit-Macro With Fully Parallel Product-Sum Operation for Binarized DNN Edge Processors