Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian

2023-05-19

Abstract: Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks, due to their event-driven spiking computation. However, state-of-the-art deep SNNs (including Spikformer and SEW ResNet) suffer from non-spike computations (integer-float multiplications) caused by the structure of their residual connection. These non-spike computations increase SNNs' power consumption and make them unsuitable for deployment on mainstream neuromorphic hardware, which only supports spike operations. In this paper, we propose a hardware-friendly spike-driven residual learning architecture for SNNs to avoid non-spike computations. Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network. We evaluate Spikingformer on the ImageNet, CIFAR10, CIFAR100, CIFAR10-DVS and DVS128 Gesture datasets, and demonstrate that Spikingformer, as a novel advanced backbone, outperforms the state-of-the-art in directly trained pure SNNs (75.85$\%$ top-1 accuracy on ImageNet, +1.04$\%$ compared with Spikformer). Furthermore, our experiments verify that Spikingformer effectively avoids non-spike computations and reduces energy consumption by 57.34$\%$ compared with Spikformer on ImageNet. To the best of our knowledge, this is the first time that a pure event-driven transformer-based SNN has been developed.

Neural and Evolutionary Computing, Artificial Intelligence

What problem does this paper attempt to address?

The paper addresses two issues:

1. **Non-spiking computation issue**: Existing deep spiking neural networks (SNNs) such as Spikformer and SEW ResNet perform non-spike computations (integer-float multiplications) because of the structure of their residual connections. This prevents them from fully exploiting the energy efficiency of event-driven computation and makes them unsuitable for mainstream neuromorphic hardware, which supports only spike operations.
2. **Energy efficiency issue**: Because of these non-spike computations, existing SNN models cannot fully realize their low-power advantage, and their energy consumption in hardware ends up close to that of conventional artificial neural networks (ANNs) with similar structures.

To address these issues, the authors propose a hardware-friendly, spike-driven residual learning architecture and build Spikingformer, a pure transformer-based SNN, on top of it. By avoiding non-spike computations, Spikingformer reaches 75.85% top-1 accuracy on ImageNet (+1.04% over Spikformer) and reduces energy consumption by 57.34% compared with Spikformer on ImageNet. It is also evaluated on CIFAR10, CIFAR100, CIFAR10-DVS, and DVS128 Gesture.
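The link between residual structure and integer-float multiplications can be illustrated with a toy sketch. It assumes (this page does not spell it out) that the earlier designs add two spike tensors after the spiking neuron, while a spike-driven design adds float pre-activations before the neuron. All function names, weights, and the threshold value below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def heaviside(x, threshold=1.0):
    """Toy spiking neuron: emit 1 where the input reaches the threshold, else 0."""
    return [1 if v >= threshold else 0 for v in x]

def linear(x, w):
    """Dense layer y = W x on plain Python lists."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]

def add_after_spike_block(s_in, w):
    """Assumed earlier-style shortcut: spikes + spikes, summed AFTER the neuron.
    The result can contain 2, so the next weight layer multiplies
    integers by floats instead of just selecting weights."""
    s_branch = heaviside(linear(s_in, w))
    return [a + b for a, b in zip(s_in, s_branch)]

def add_before_spike_block(u_in, w):
    """Assumed spike-driven shortcut: the residual is carried on float
    pre-activations, and the weight layer only ever sees binary spikes."""
    s = heaviside(u_in)                      # binary input to the weights
    u_out = [a + b for a, b in zip(u_in, linear(s, w))]
    return u_out, s

# Hypothetical diagonal weights, small enough to check by hand.
w = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

non_binary_out = add_after_spike_block([1, 1, 0, 1], w)
u_out, weight_input = add_before_spike_block([1.2, 0.3, 2.0, -0.5], w)
next_weight_input = heaviside(u_out)         # what the following layer receives

print("add-after-spike output:", non_binary_out)
print("spike-driven weight inputs:", weight_input, next_weight_input)
```

In the first block the output is no longer a spike train, so a downstream weight layer needs true multiplications. In the second, every weight layer receives only 0/1 values, which reduces the multiply to accumulating the weights selected by the spikes.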


Spikeformer: Training high-performance spiking neural network with transformer

Efficient Structure Slimming for Spiking Neural Networks

Spikformer: When Spiking Neural Network Meets Transformer

Spiking Deep Residual Networks

Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training

Spike Trains Encoding and Threshold Rescaling Method for Deep Spiking Neural Networks

TE-Spikformer: Temporal-enhanced spiking neural network with transformer

Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket

Spike Trains Encoding Optimization for Spiking Neural Networks Implementation in FPGA

An Efficient Learning Algorithm for Direct Training Deep Spiking Neural Networks

Towards Energy-Preserving Natural Language Understanding with Spiking Neural Networks

Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips

Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers

Spiking Wavelet Transformer

Combining Aggregated Attention and Transformer Architecture for Accurate and Efficient Performance of Spiking Neural Networks

Spiking Deep Residual Network

SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer

Efficient Deep Spiking Multilayer Perceptrons With Multiplication-Free Inference