Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

Chenlin Zhou, Liutao Yu, Zhaokun Zhou, Zhengyu Ma, Han Zhang, Huihui Zhou, Yonghong Tian
2023-05-19
Abstract: Spiking neural networks (SNNs) offer a promising energy-efficient alternative to artificial neural networks, due to their event-driven spiking computation. However, state-of-the-art deep SNNs (including Spikformer and SEW ResNet) suffer from non-spike computations (integer-float multiplications) caused by the structure of their residual connections. These non-spike computations increase SNNs' power consumption and make them unsuitable for deployment on mainstream neuromorphic hardware, which only supports spike operations. In this paper, we propose a hardware-friendly spike-driven residual learning architecture for SNNs to avoid non-spike computations. Based on this residual design, we develop Spikingformer, a pure transformer-based spiking neural network. We evaluate Spikingformer on the ImageNet, CIFAR10, CIFAR100, CIFAR10-DVS and DVS128 Gesture datasets, and demonstrate that Spikingformer, as a novel advanced backbone, outperforms the state of the art among directly trained pure SNNs (75.85$\%$ top-1 accuracy on ImageNet, +1.04$\%$ compared with Spikformer). Furthermore, our experiments verify that Spikingformer effectively avoids non-spike computations and reduces energy consumption by 57.34$\%$ compared with Spikformer on ImageNet. To the best of our knowledge, this is the first time that a pure event-driven transformer-based SNN has been developed.
Neural and Evolutionary Computing, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to address the following issues:
1. **Non-spike computation issue**: Existing deep spiking neural networks (SNNs) such as Spikformer and SEW ResNet perform non-spike computations (integer-float multiplications) because of the structure of their residual connections. This keeps them from fully exploiting the energy efficiency of event-driven computation and makes them unsuitable for mainstream neuromorphic hardware, which only supports spike operations.
2. **Energy efficiency issue**: Because of these non-spike computations, existing SNN models cannot fully realize their low-power advantage, so their energy consumption in hardware is higher than it needs to be and close to that of traditional artificial neural networks (ANNs) with similar structures.

To address these issues, the authors propose a spike-driven residual learning architecture and, based on it, develop Spikingformer, a pure transformer-based SNN. By avoiding non-spike computations, Spikingformer reaches 75.85% top-1 accuracy on ImageNet (1.04 points above Spikformer) while reducing energy consumption by 57.34% compared with Spikformer on ImageNet. The model is also evaluated on CIFAR10, CIFAR100, CIFAR10-DVS, and DVS128 Gesture, where it is reported to outperform the state of the art among directly trained pure SNNs.
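The non-spike computation described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the stateless threshold neuron, the dense weight matrix, and the function names `sew_style_block` and `spike_driven_block` are all hypothetical simplifications, and the placement of the neuron before the weights in the second block is an assumption about how a spike-driven residual could be arranged. The point it shows is only this: if a residual adds two spike tensors, the sum can contain values such as 2, so the next layer's float weights are multiplied by integers rather than gated by binary spikes; if the residual instead adds real-valued signals and a neuron sits in front of every weight layer, the weights only ever see 0 or 1.

```python
import numpy as np


def spike(x, threshold=1.0):
    """Toy stateless neuron (assumption): emit 1.0 where x >= threshold, else 0.0."""
    return (x >= threshold).astype(np.float64)


def is_binary(x):
    """True if every element of x is exactly 0 or 1."""
    return bool(np.all((x == 0) | (x == 1)))


def sew_style_block(s_in, w):
    """Hypothetical residual that adds spikes AFTER the neuron.

    The output feeds the next weight layer directly, so any entry equal
    to 2 turns a spike-gated accumulation into an integer-float multiply.
    """
    return spike(s_in @ w) + s_in


def spike_driven_block(u_in, w):
    """Hypothetical residual that adds real-valued signals.

    The neuron fires first, so `w` only multiplies binary spikes; the
    residual sum stays real-valued and is converted to spikes again by
    the neuron of the next block.
    """
    return spike(u_in) @ w + u_in


# Hand-picked toy weights (illustrative values only).
w = np.array([
    [0.7, 1.2, -0.3, 0.1],
    [0.5, 0.5, 0.5, 0.5],
    [0.6, 0.1, 0.2, -0.4],
    [0.0, 0.0, 0.0, 0.0],
])

# (a) Spike-addition residual: input is already a spike vector.
s_in = np.array([1.0, 0.0, 1.0, 0.0])
out_a = sew_style_block(s_in, w)   # [2. 1. 1. 0.]
weights_see_a = out_a              # passed to the next weights as-is

# (b) Spike-driven residual: input is a real-valued signal.
u_in = np.array([1.5, 0.2, 1.1, -0.4])
out_b = spike_driven_block(u_in, w)
weights_see_b = spike(out_b)       # next block's neuron fires before its weights

print("spike-addition residual, input to next weights:", weights_see_a)
print("binary?", is_binary(weights_see_a))
print("spike-driven residual, input to next weights:", weights_see_b)
print("binary?", is_binary(weights_see_b))
```

In the first case the value 2 in the residual output is what forces a true multiplication in the following layer; in the second case every weight layer can be evaluated by accumulating the rows selected by spikes. A real SNN would use stateful neurons (for example leaky integrate-and-fire) over multiple time steps, which this sketch omits.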