Abstract: Spiking neural networks (SNNs) are promising alternatives to artificial neural networks (ANNs) because they are more biologically realistic, brain-inspired computing models. Neurons in SNNs fire sparsely over time, a property known as spatiotemporal sparsity, which makes SNNs well suited to energy-efficient hardware inference. However, exploiting this sparsity in hardware leads to unpredictable and unbalanced workloads, which degrade energy efficiency. Compared with SNNs built from simple fully connected layers, SNNs with richer structures (e.g., standard, depthwise, and pointwise convolutions) can handle more complicated tasks but are harder to map onto hardware. In this work, we propose Cerebron, a novel reconfigurable architecture that fully exploits the spatiotemporal sparsity of SNNs with maximized data reuse, together with optimization techniques that improve the efficiency and flexibility of the hardware. For flexibility, the reconfigurable compute engine is compatible with a variety of spiking layers and supports both inter-computing-unit (inter-CU) and intra-CU reconfiguration. For memory efficiency, the compute engine exploits data reuse and guarantees parallel data access when processing different types of convolution. A two-step data sparsity exploitation method leverages the sparsity of discrete spikes to reduce computation time, and an online channelwise workload scheduling strategy further reduces latency. Cerebron is verified on image segmentation and classification tasks using a variety of state-of-the-art spiking network structures. Experimental results show that Cerebron achieves at least a $17.5\times$ reduction in prediction energy and a $20\times$ speedup compared with state-of-the-art field-programmable gate array (FPGA)-based accelerators.
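
The abstract does not describe how the channelwise workload scheduling works, so the following is only a minimal sketch of the general idea of balancing sparse, per-channel workloads across computing units. It is not Cerebron's scheduler. It assumes that the work of a channel is proportional to its spike count and uses a generic greedy longest-processing-time heuristic; the function name `schedule_channels` and its parameters are hypothetical.

```python
import heapq


def schedule_channels(spike_counts, num_units):
    """Greedily assign channels to compute units to balance spike workload.

    Assumption: the cost of a channel equals its spike count. This is a
    generic longest-processing-time heuristic, not the paper's method.
    """
    # Min-heap of (accumulated load, unit index): the least-loaded unit pops first.
    heap = [(0, u) for u in range(num_units)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_units)]

    # Visit the busiest channels first so large workloads are placed early.
    order = sorted(range(len(spike_counts)),
                   key=lambda c: spike_counts[c], reverse=True)
    for c in order:
        if spike_counts[c] == 0:
            continue  # silent channel: no spikes, so no work to schedule
        load, u = heapq.heappop(heap)
        assignment[u].append(c)
        heapq.heappush(heap, (load + spike_counts[c], u))

    loads = [sum(spike_counts[c] for c in chans) for chans in assignment]
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical per-channel spike counts for one time step.
    assignment, loads = schedule_channels([9, 0, 4, 3, 7, 1], num_units=2)
    print(assignment, loads)
```

With these hypothetical counts, a static split into the first three and last three channels would give loads of 13 and 11, whereas the greedy assignment gives each unit a load of 12. This illustrates why scheduling on observed spike counts can shorten the critical path when firing is unbalanced.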

Marmotini: A Weight Density Adaptation Architecture with Hybrid Compression Method for Spiking Neural Network

An Efficient Spiking Neural Network Accelerator with Sparse Weight

An Event-driven Spiking Neural Network Accelerator with On-chip Sparse Weight

A Convolutional Spiking Neural Network Accelerator with the Sparsity-Aware Memory and Compressed Weights

A Spiking Neural Network Accelerator based on Ping-Pong Architecture with Sparse Spike and Weight

ESSA: Design of a Programmable Efficient Sparse Spiking Neural Network Accelerator

Spike Trains Encoding Optimization for Spiking Neural Networks Implementation in FPGA

A Sparsity-Adapted Hardware Implementation of SNN for Cortical Spike Trains Decoding

Efficient Structure Slimming for Spiking Neural Networks

30.2 A 22nm 0.26nW/Synapse Spike-Driven Spiking Neural Network Processing Unit Using Time-Step-First Dataflow and Sparsity-Adaptive In-Memory Computing

A 4096-Neuron 1M-Synapse 3.8-pJ/SOP Spiking Neural Network With On-Chip STDP Learning and Sparse Weights in 10-nm FinFET CMOS

NBSSN: A Neuromorphic Binary Single-Spike Neural Network for Efficient Edge Intelligence

An End-to-End SoC for Brain-Inspired CNN-SNN Hybrid Applications

A Hybrid Heterogeneous Neural Network Accelerator Based on Systolic Array

SpiDR: A Reconfigurable Digital Compute-in-Memory Spiking Neural Network Accelerator for Event-based Perception

Design of Multi-core Spiking Neural Network Chip Based on Butterfly Network

Exploring the Sparsity-Quantization Interplay on a Novel Hybrid SNN Event-Driven Architecture

A Scatter-and-Gather Spiking Convolutional Neural Network on a Reconfigurable Neuromorphic Hardware

Resource Constrained Model Compression via Minimax Optimization for Spiking Neural Networks

Cerebron: A Reconfigurable Architecture for Spatiotemporal Sparse Spiking Neural Networks

A 1.13μJ/classification Spiking Neural Network Accelerator with a Single-Spike Neuron Model and Sparse Weights