Abstract: Spiking Neural Networks (SNNs) have shown remarkable promise in computer vision, emerging as a low-energy alternative to traditional Artificial Neural Networks (ANNs). However, SNNs face several challenges: i) existing SNNs are not purely additive and involve a substantial amount of floating-point computation, which contradicts their original design goal of running on neuromorphic chips; ii) incorrect positioning of convolutional and pooling layers relative to spiking layers reduces accuracy; iii) Leaky Integrate-and-Fire (LIF) neurons have limited capability to represent local information, which is a disadvantage for downstream visual tasks such as semantic segmentation. To address these challenges, i) we introduce Pure Sparse Self Attention (PSSA) and Dynamic Spiking Membrane Shortcut (DSMS) and combine them to tackle the issue of floating-point computation; ii) we propose the Spiking Precise Gradient downsampling (SPG-down) method for accurate gradient transmission; iii) we introduce the Group-LIF neuron, which lets LIF neurons represent local information both horizontally and vertically and makes them better suited to semantic segmentation. These three solutions are integrated into the Powerful Sparse-Spike-Driven Transformer (PSSD-Transformer), which handles semantic segmentation tasks while addressing the challenges above. Experimental results show that our model outperforms previous results on standard classification datasets and also performs well on semantic segmentation datasets. To date, PSSD-Transformer is the first model in the SNN field to perform semantic segmentation on large datasets. The code will be made publicly available after the paper is accepted for publication.
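For readers unfamiliar with the LIF neuron mentioned in the abstract, the following is a minimal sketch of the standard textbook discrete-time LIF update with a hard reset. It is not the Group-LIF neuron proposed in the paper, and the function name `lif_run` and the parameters `tau`, `v_threshold`, and `v_reset` are illustrative assumptions rather than anything taken from the authors' code.

```python
def lif_run(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one textbook LIF neuron over a sequence of input currents.

    Hypothetical helper for illustration only; returns one binary spike
    (0 or 1) per time step.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: the membrane potential decays toward v_reset
        # while being driven by the input current x.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after the neuron fires
        else:
            spikes.append(0)
    return spikes


print(lif_run([1.5, 1.5, 1.5, 1.5]))  # [0, 1, 0, 1]
```

Because the output is binary, a layer that consumes these spikes can, in principle, replace weight multiplications with additions of the weights selected by active spikes, which is the property the abstract refers to as "purely additive".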

Spikformer: When Spiking Neural Network Meets Transformer

Attention-free Spikformer: Mixing Spike Sequences with Simple Linear Transforms

SpikingResformer: Bridging ResNet and Vision Transformer in Spiking Neural Networks

Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket

Spikingformer: Spike-driven Residual Learning for Transformer-based Spiking Neural Network

TE-Spikformer: Temporal-enhanced spiking neural network with transformer

Spikeformer: Training high-performance spiking neural network with transformer

Spiking Transformer with Spatial-Temporal Attention

Efficient Structure Slimming for Spiking Neural Networks

QKFormer: Hierarchical Spiking Transformer using Q-K Attention

Auto-Spikformer: Spikformer architecture search

Scaling Spike-driven Transformer with Efficient Spike Firing Approximation Training

PSSD-Transformer: Powerful Sparse Spike-Driven Transformer for Image Semantic Segmentation

Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification

Spiking Wavelet Transformer

SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer

Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips

Hierarchical Spiking-Based Model for Efficient Image Classification with Enhanced Feature Extraction and Encoding

SpikeGraphormer: A High-Performance Graph Transformer with Spiking Graph Attention

CSNN: an Augmented Spiking Based Framework with Perceptron-Inception

Enhancing the Performance of Transformer-based Spiking Neural Networks by SNN-optimized Downsampling with Precise Gradient Backpropagation