Abstract: Spiking neural networks (SNNs) are considered energy-efficient alternatives to deep neural networks (DNNs). By adopting event-driven information processing, SNNs can significantly reduce the computational demands associated with DNNs while still achieving comparable performance. However, current SNNs primarily pursue high accuracy and large sparsity by constructing complex neuron models that generate sparse spikes. Unfortunately, this approach results in low energy efficiency and high latency, posing a significant challenge for deploying SNNs at the edge. Furthermore, the dominant computation in SNNs, spike-wise add-accumulate operations, is well suited to processing-in-memory (PIM) architectures. However, exploiting highly parallel processing and spike sparsity in PIM-based SNN accelerators is challenging due to the irregularity and time dependency of spikes. To address these issues, this paper proposes LowPASS, an algorithm-hardware co-design framework. LowPASS exploits the inherently rich sparsity in SNN spike activity using an in-situ compute-enabled crossbar architecture to achieve high energy efficiency and low latency without compromising accuracy. First, we selectively merge multiple time steps into single-shot computations based on output sparsity, enabling speculative fast forwarding. This approach improves crossbar utilization by reducing idling during sparse spike processing. Second, we introduce a dedicated PIM-based architecture for SNNs that leverages the large sparsity of SNNs for highly parallel and energy-efficient computation. LowPASS incorporates popcount-based circuits to efficiently handle merged time steps on the crossbar, maximizing spike sparsity utilization and crossbar parallelism. Evaluations show that LowPASS achieves 6,860X energy savings over the earlier ANN accelerator Eyeriss (8-bit version). Compared with the state-of-the-art SNN accelerator T2FSNN, LowPASS achieves a 2,147X energy reduction. Even compared with a sparsity-aware PIM architecture, LowPASS still shows a 5.09X energy efficiency gain. These results demonstrate the performance and energy efficiency of LowPASS for SNN inference tasks.
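
The idea of merging several time steps into one computation with popcounts can be illustrated with a minimal Python sketch. It is not the LowPASS implementation: the function names, the toy weights, and the spike window are all hypothetical, and it assumes no membrane leak, firing, or reset occurs inside the merged window (presumably the case the abstract's sparsity-based selection is meant to target). Under that assumption, counting each input's spikes across the window and doing a single weighted accumulation gives the same total synaptic input as accumulating step by step.

```python
# Illustrative sketch only; names and data are hypothetical, not from the paper.

def popcount_merge(spike_window):
    """Count spikes per input line over a window of time steps."""
    n_inputs = len(spike_window[0])
    return [sum(step[i] for step in spike_window) for i in range(n_inputs)]

def accumulate_stepwise(weights, spike_window):
    """Baseline: one add-accumulate pass over the weights per time step."""
    totals = [0] * len(weights)
    for step in spike_window:
        for j, row in enumerate(weights):
            totals[j] += sum(w for w, s in zip(row, step) if s)
    return totals

def accumulate_merged(weights, spike_window):
    """Merged: a single pass using per-input spike counts.

    Assumes no leak, firing, or reset happens inside the window.
    """
    counts = popcount_merge(spike_window)
    return [sum(w * c for w, c in zip(row, counts)) for row in weights]

# Toy example: 2 output neurons, 3 inputs, 4 time steps.
weights = [[2, -1, 3], [0, 4, -2]]
spike_window = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [0, 0, 0]]

assert accumulate_merged(weights, spike_window) == accumulate_stepwise(weights, spike_window)
```

In this toy setting the merged version touches the weight array once instead of four times, which is the intuition behind reducing crossbar idling when spikes are sparse.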

SparseNN: an Energy-Efficient Neural Network Accelerator Exploiting Input and Output Sparsity

An Energy-Efficient Spiking Neural Network Accelerator Based on Spatio-Temporal Redundancy Reduction

An Efficient Hardware Architecture for DNN Training by Exploiting Triple Sparsity

SparseNN: A Performance-Efficient Accelerator for Large-Scale Sparse Neural Networks

MF-DSNN: An Energy-efficient High-performance Multiplication-free Deep Spiking Neural Network Accelerator

A Precision-Scalable Deep Neural Network Accelerator with Activation Sparsity Exploitation

An Efficient Spiking Neural Network Accelerator with Sparse Weight

SPAT: FPGA-based Sparsity-Optimized Spiking Neural Network Training Accelerator with Temporal Parallel Dataflow

An Event-driven Spiking Neural Network Accelerator with On-chip Sparse Weight

Procrustes: a Dataflow and Accelerator for Sparse Deep Neural Network Training

ESSA: Design of a Programmable Efficient Sparse Spiking Neural Network Accelerator

Exploring Resource-Aware Deep Neural Network Accelerator and Architecture Design

LRADNN: High-throughput and energy-efficient Deep Neural Network accelerator using Low Rank Approximation

A Computing Efficient Hardware Architecture for Sparse Deep Neural Network Computing

Work-in-Progress: A High-performance FPGA Accelerator for Sparse Neural Networks

Design Space Exploration of Sparsity-Aware Application-Specific Spiking Neural Network Accelerators

LowPASS: A Low Power PIM-based Accelerator with Speculative Scheme for SNNs

Hardware Accelerator Design for Sparse DNN Inference and Training: A Tutorial

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design

A Data-Driven Asynchronous Neural Network Accelerator

Randomize and Match: Exploiting Irregular Sparsity for Energy Efficient Processing in SNNs