Abstract: Spiking Neural Networks (SNNs) have shown considerable promise in computer vision as a low-energy alternative to traditional Artificial Neural Networks (ANNs). However, SNNs face several challenges: i) existing SNNs are not purely additive and involve a substantial amount of floating-point computation, which contradicts their original design goal of running on neuromorphic chips; ii) improper placement of convolutional and pooling layers relative to spiking layers reduces accuracy; iii) Leaky Integrate-and-Fire (LIF) neurons have limited capacity to represent local information, which is a disadvantage for downstream visual tasks such as semantic segmentation. To address these challenges, i) we introduce Pure Sparse Self Attention (PSSA) and Dynamic Spiking Membrane Shortcut (DSMS) and combine them to address the floating-point computation problem; ii) we propose the Spiking Precise Gradient downsampling (SPG-down) method for accurate gradient transmission; iii) we introduce the Group-LIF neuron, which lets LIF neurons represent local information both horizontally and vertically and makes them better suited to semantic segmentation. We integrate these three solutions into the Powerful Sparse-Spike-Driven Transformer (PSSD-Transformer), which handles semantic segmentation while addressing the challenges above. Experimental results show that our model outperforms previous results on standard classification datasets and performs well on semantic segmentation datasets. To date, PSSD-Transformer is the first model in the SNN field to perform semantic segmentation on large datasets. The code will be made publicly available after the paper is accepted for publication.

A lightweight siamese transformer for few-shot semantic segmentation

Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation

Multi-Similarity Enhancement Network for Few-Shot Segmentation

Feature-Proxy Transformer for Few-Shot Segmentation

Adaptive Agent Transformer for Few-Shot Segmentation

Few-shot Semantic Segmentation with Support-induced Graph Convolutional Network

Cross-Domain Few-Shot Semantic Segmentation via Doubly Matching Transformation

Iterative Few-shot Semantic Segmentation from Image Label Text

Intermediate Prototype Mining Transformer for Few-Shot Semantic Segmentation

Simpler is Better: Few-shot Semantic Segmentation with Classifier Weight Transformer

Dynamic Prototype Convolution Network for Few-Shot Semantic Segmentation

PSSD-Transformer: Powerful Sparse Spike-Driven Transformer for Image Semantic Segmentation

Focus on Query: Adversarial Mining Transformer for Few-Shot Segmentation

CATrans: Context and Affinity Transformer for Few-Shot Segmentation

Self-Support Few-Shot Semantic Segmentation

MSDNet: Multi-Scale Decoder for Few-Shot Semantic Segmentation via Transformer-Guided Prototyping

Few-Shot Segmentation Via Cycle-Consistent Transformer

Cycle association prototype network for few-shot semantic segmentation

Adaptive FSS: A Novel Few-Shot Segmentation Framework Via Prototype Enhancement

Few-Shot 3D Point Cloud Semantic Segmentation via Stratified Class-Specific Attention Based Transformer Network

Global–Local Query-Support Cross-Attention for Few-Shot Semantic Segmentation