Abstract: Spiking Neural Networks (SNNs) are expected to be a promising alternative to Artificial Neural Networks (ANNs) due to their strong biological interpretability and high energy efficiency. Specialized SNN hardware offers clear advantages over general-purpose devices in terms of power and performance. However, there's still room to advance hardware support for state-of-the-art (SOTA) SNN algorithms and improve computation and memory efficiency. As a further step in supporting high-performance SNNs on specialized hardware, we introduce FireFly v2, an FPGA SNN accelerator that can address the issue of non-spike operation in current SOTA SNN algorithms, which presents an obstacle in the end-to-end deployment onto existing SNN hardware. To more effectively align with the SNN characteristics, we design a spatiotemporal dataflow that allows four dimensions of parallelism and eliminates the need for membrane potential storage, enabling on-the-fly spike processing and spike generation. To further improve hardware acceleration performance, we develop a high-performance spike computing engine as a backend based on a systolic array operating at 500-600MHz. To the best of our knowledge, FireFly v2 achieves the highest clock frequency among all FPGA-based implementations. Furthermore, it stands as the first SNN accelerator capable of supporting non-spike operations, which are commonly used in advanced SNN algorithms. FireFly v2 has doubled the throughput and DSP efficiency when compared to our previous version of FireFly and it exhibits 1.33 times the DSP efficiency and 1.42 times the power efficiency compared to the current most advanced FPGA accelerators.

What problem does this paper attempt to address?

The main problem this paper addresses is that existing dedicated SNN (Spiking Neural Network) hardware does not adequately support the latest SNN algorithms, in particular their non-spike operations. Specifically, current SNN hardware designs cannot effectively handle the following cases:

1. **Pixel operations in the direct encoding layer**: With direct input encoding, the first convolutional layer consumes analog pixel values, which are incompatible with existing spike-based SNN hardware.
2. **Multi-bit spike operations in SEW-ResNet**: SEW-ResNet sums spikes element-wise across residual branches. The summed values are no longer binary, so the next convolutional layer has to perform a non-spike convolution.
3. **Fractional-spike convolution introduced by the average pooling layer**: The average pooling commonly used in SNN models turns binary spikes into fractional values, and existing SNN hardware cannot convolve over them.

To address these challenges, the paper proposes FireFly v2, an FPGA-based SNN accelerator with the following features:

- **Support for non-spike operations**: FireFly v2 supports direct encoding, spike-element-wise residual connections, and common average pooling operations.
- **Support for multiple neural dynamics**: It supports different neuron models, such as IF (Integrate-and-Fire), LIF (Leaky Integrate-and-Fire), and RMP (Residual Membrane Potential).
- **Arbitrary convolution configurations**: It supports different convolution kernel sizes, strides, and padding configurations.
- **Spatiotemporal dataflow**: It adopts a dataflow with four dimensions of parallelism: input channel, output channel, pixel, and time step.
- **High-performance spike computing engine**: It integrates a high-performance spike computing engine based on a systolic array, operating at 500-600MHz.

With these improvements, FireFly v2 both supports the latest SNN algorithms more completely and improves hardware performance, including throughput and DSP (digital signal processing) efficiency.
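The two non-spike cases that arise inside the network (average pooling and spike-element-wise summation) can be illustrated with a small pure-Python sketch. It is a toy illustration only, not FireFly v2's implementation: the helper names `avg_pool_2x2`, `sew_add`, and `is_binary`, and the 4x4 spike maps, are assumptions made up for this example.

```python
# Toy illustration (hypothetical helpers, not FireFly v2 code):
# binary spike maps stop being binary after pooling or residual summation.

def avg_pool_2x2(spikes):
    """2x2 average pooling with stride 2 over a spike map (list of rows)."""
    h, w = len(spikes), len(spikes[0])
    return [
        [
            (spikes[r][c] + spikes[r][c + 1]
             + spikes[r + 1][c] + spikes[r + 1][c + 1]) / 4
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def sew_add(branch_a, branch_b):
    """Element-wise sum of two spike maps, as in a SEW-style residual add."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(branch_a, branch_b)]


def is_binary(fmap):
    """True if every value is 0 or 1, i.e. the map is a pure spike map."""
    return all(v in (0, 1) for row in fmap for v in row)


# Two made-up binary spike maps standing in for layer outputs.
spikes_a = [[1, 0, 1, 1],
            [0, 0, 1, 1],
            [1, 1, 0, 0],
            [1, 0, 0, 0]]
spikes_b = [[1, 1, 0, 1],
            [0, 0, 0, 1],
            [0, 1, 0, 0],
            [1, 0, 1, 0]]

pooled = avg_pool_2x2(spikes_a)      # fractional values such as 0.25
summed = sew_add(spikes_a, spikes_b)  # multi-bit values such as 2

print(is_binary(spikes_a), is_binary(pooled), is_binary(summed))
```

In this sketch the inputs pass the `is_binary` check while both derived maps fail it, which is why a layer that consumes them can no longer treat its inputs as single-bit spikes.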

FireFly v2: Advancing Hardware Support for High-Performance Spiking Neural Network with a Spatiotemporal FPGA Accelerator
