Abstract: Spiking Neural Networks (SNNs) have shown remarkable promise in computer vision as a low-energy alternative to traditional Artificial Neural Networks (ANNs). However, SNNs face several challenges: i) existing SNNs are not purely additive and involve a substantial amount of floating-point computation, which contradicts their original design intention of running on neuromorphic chips; ii) incorrect positioning of convolutional and pooling layers relative to spiking layers reduces accuracy; iii) Leaky Integrate-and-Fire (LIF) neurons have limited capability in representing local information, which is a disadvantage for downstream visual tasks such as semantic segmentation. To address these challenges, i) we introduce Pure Sparse Self Attention (PSSA) and Dynamic Spiking Membrane Shortcut (DSMS) and combine them to tackle the issue of floating-point computations; ii) we propose the Spiking Precise Gradient downsampling (SPG-down) method for accurate gradient transmission; iii) we introduce the Group-LIF neuron so that LIF neurons can represent local information both horizontally and vertically, improving their applicability to semantic segmentation tasks. These three solutions are integrated into the Powerful Sparse-Spike-Driven Transformer (PSSD-Transformer), which handles semantic segmentation tasks while addressing the challenges above. Experimental results show that our model outperforms previous results on standard classification datasets and also performs well on semantic segmentation datasets. To date, PSSD is the first model in the SNN field to perform semantic segmentation on large datasets. The code will be made publicly available after the paper is accepted for publication.

PPEDNet: Pyramid Pooling Encoder-Decoder Network for Real-Time Semantic Segmentation

ELKPPNet: An Edge-aware Neural Network with Large Kernel Pyramid Pooling for Learning Discriminative Features in Semantic Segmentation

DPNet: Dual-Pyramid Semantic Segmentation Network Based on Improved Deeplabv3 Plus

FCPFNet: Feature Complementation Network with Pyramid Fusion for Semantic Segmentation

PPNet: Pooling Position Attention Network for Semantic Segmentation

A Method of Image Semantic Segmentation Based on PSPNet

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

PSSD-Transformer: Powerful Sparse Spike-Driven Transformer for Image Semantic Segmentation

DMPNet: Distributed Multi-Scale Pyramid Network for Real-Time Semantic Segmentation

MAPPING NEUTRAL HYDROGEN IN EXTERNAL GALAXIES

Reparameterizable Dual-Resolution Network for Real-time Semantic Segmentation

High-Resolution Remote Sensing Image Semantic Segmentation Method Based on Improved Encoder-Decoder Convolutional Neural Network

LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation

Encoder-Decoder with Double Spatial Pyramid for Semantic Segmentation

EPRNet: Efficient Pyramid Representation Network for Real-Time Street Scene Segmentation

Shift Pooling PSPNet: Rethinking PSPNet for Building Extraction in Remote Sensing Images from Entire Local Feature Pooling

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

DPANET: Dual Pooling Attention Network for Semantic Segmentation

IIE-SegNet: Deep Semantic Segmentation Network With Enhanced Boundary Based on Image Information Entropy

A Fast Attention-Guided Hierarchical Decoding Network for Real-Time Semantic Segmentation

FBSNet: A Fast Bilateral Symmetrical Network for Real-Time Semantic Segmentation