Abstract: Event-driven spiking neural networks (SNNs) have demonstrated significant potential for achieving high energy and area efficiency. However, existing SNN accelerators suffer from high latency and energy consumption caused by serial accumulation-comparison operations: an SNN neuron integrates input spikes, accumulates membrane potential, and generates an output spike when the potential exceeds a threshold, so each time step depends on the previous one. One way to address this is to leverage the sparsity of SNN spikes to reduce the number of time steps, but this can result in imbalanced workloads among neurons and limit the utilization of processing elements (PEs). In this paper, we present SATO, a temporal-parallel SNN accelerator that accumulates membrane potential for all time steps in parallel. SATO adopts a two-stage pipeline that decouples the neuron computation into distinct stages, which maintains accuracy while exposing fine-grained parallelism: spike accumulation for each time step executes concurrently, reducing latency and improving overall efficiency. The architecture includes a novel binary adder-search tree for generating the output spike train, which removes the chronological dependence in the accumulation-comparison operation. Furthermore, SATO employs a bucket-sort-based method to evenly distribute compressed workloads across all PEs, maximizing data locality of input spike trains. Experimental results on various SNN models show that, on average, SATO achieves a 20.7× speedup and 6.0× energy saving over the 8-bit version of the well-known accelerator "Eyeriss". Compared to the state-of-the-art SNN accelerator "SpinalFlow", SATO achieves a 4.6× performance gain and 3.1× energy reduction on average.
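The decoupling idea described in the abstract can be illustrated with a small software sketch. This is not SATO's hardware design or its adder-search tree. It is a minimal model under stated assumptions: a non-leaky integrate-and-fire neuron with no reset that emits at most one output spike, at the first time step where the potential exceeds the threshold. All function names and example values are hypothetical. The serial version interleaves accumulation and comparison step by step, while the two-stage version first computes every time step's contribution independently (the part that could run in parallel) and only then combines and searches them.

```python
from itertools import accumulate

def serial_first_spike(spikes, weights, threshold):
    """Baseline: accumulate and compare one time step at a time."""
    potential = 0
    for t, step in enumerate(spikes):
        # Add the weights of all inputs that spiked at time step t.
        potential += sum(w for s, w in zip(step, weights) if s)
        if potential > threshold:
            return t
    return None  # the neuron never fires

def two_stage_first_spike(spikes, weights, threshold):
    """Sketch of a decoupled version (assumed model, not SATO's circuit)."""
    # Stage 1: per-time-step contributions. Each entry depends only on
    # its own time step, so all entries could be computed concurrently.
    per_step = [sum(w for s, w in zip(step, weights) if s) for step in spikes]
    # Stage 2: prefix sums give the membrane potential at every step,
    # then a search finds the first step that exceeds the threshold.
    potentials = accumulate(per_step)
    for t, v in enumerate(potentials):
        if v > threshold:
            return t
    return None

# Hypothetical example: 4 time steps, 3 input synapses.
example_spikes = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
example_weights = [2, 3, -1]
print(two_stage_first_spike(example_spikes, example_weights, threshold=5))
```

Both functions return the same firing time for the same inputs, which mirrors the abstract's claim that decoupling maintains accuracy. Because weights may be negative, the potential is not monotonic, so this sketch uses a linear search over the prefix sums rather than a binary search.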