P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks

Malyaban Bal, Abhronil Sengupta
2024-10-04
Abstract: Spiking neural networks (SNNs) are posited as a computationally efficient and biologically plausible alternative to conventional neural architectures, with their core computational framework primarily using the leaky integrate-and-fire (LIF) neuron model. However, the limited hidden state representation of LIF neurons, characterized by a scalar membrane potential, and their sequential spike generation process pose challenges for developing scalable spiking models that address long-range dependencies in sequence learning tasks. In this study, we develop a scalable probabilistic spiking learning framework for long-range dependency tasks, leveraging the fundamentals of state space models (SSMs). Unlike LIF neurons, which rely on the deterministic Heaviside function for a sequential process of spike generation, we introduce a SpikeSampler layer that samples spikes stochastically based on an SSM-based neuronal model while allowing parallel computations. To address the non-differentiability of the spiking operation and enable effective training, we also propose a surrogate function tailored to the stochastic nature of the SpikeSampler layer. To enhance inter-neuron communication, we introduce the SpikeMixer block, which integrates spikes from neuron populations in each layer. This is followed by a ClampFuse layer, which incorporates a residual connection to capture complex dependencies and enables the model to scale. Our models attain state-of-the-art performance among SNN models across diverse long-range dependency tasks, encompassing the Long Range Arena benchmark, permuted sequential MNIST, and the Speech Command dataset, and demonstrate sparse spiking patterns, highlighting their computational efficiency.
Neural and Evolutionary Computing
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper aims to address the challenges of handling long-range dependency tasks with spiking neural networks (SNNs). Specifically:

1. **Limitations of the traditional LIF neuron model**:
   - The hidden state of a leaky integrate-and-fire (LIF) neuron is a single scalar membrane potential, which limits its representational capacity on long-range dependency tasks.
   - LIF neurons update their state sequentially and generate spikes through the deterministic Heaviside function. This makes training complex and usually requires computationally expensive methods such as backpropagation through time (BPTT).
2. **Requirements of long-range dependency tasks**:
   - Long-range dependency tasks are central to sequence learning, for example in natural language processing (NLP) and speech recognition. They require the model to capture and process information over long time spans.
3. **Computational efficiency and biological plausibility**:
   - SNNs have attracted attention for their computational efficiency and biological plausibility. However, existing SNN models often perform poorly on long-range dependency tasks, and their sequential processing makes them costly to train and scale.

### Solutions

To solve the above problems, the paper proposes a new SNN architecture based on the probabilistic spiking state-space model (P-SpikeSSM). The main innovations include:

1. **Probabilistic spiking state-space model (P-SpikeSSM)**:
   - A multi-dimensional hidden state replaces the traditional scalar membrane potential, providing greater representational capacity.
   - The state-space model (SSM) captures temporal dependencies in input spike sequences, rather than in the real-valued data that SSMs conventionally process.
2. **SpikeSampler layer**:
   - The SpikeSampler layer samples spikes stochastically based on the SSM-based neuronal model. It supports parallel computation and avoids the bottleneck of sequential processing.
3. **SpikeMixer block**:
   - The SpikeMixer block integrates the spikes from the populations of P-SpikeSSM neurons in each layer, which promotes communication between neurons.
4. **ClampFuse layer**:
   - The ClampFuse layer combines the input spikes with the output of the SpikeMixer block through a residual connection. This captures complex dependencies and improves the scalability of the model.

### Experimental results

The paper evaluates the proposed P-SpikeSSM model on multiple long-range dependency tasks, including:

- **Long Range Arena (LRA) benchmark**:
  - On tasks such as ListOps, Text Retrieval, and Image Pathfinder, the P-SpikeSSM model outperforms traditional non-spiking models and some spiking models.
- **Permuted Sequential MNIST (psMNIST)**:
  - On the psMNIST dataset, the P-SpikeSSM model achieves performance comparable to the current best non-spiking model and outperforms non-spiking Transformer architectures.
- **Speech Command (SC10)**:
  - On the SC10 dataset, the P-SpikeSSM model outperforms many contemporary non-spiking architectures.

### Conclusion

By introducing the probabilistic spiking state-space model (P-SpikeSSM), the paper addresses the limitations of the traditional LIF neuron model on long-range dependency tasks while maintaining computational efficiency and biological plausibility. The experimental results show that the P-SpikeSSM model performs well on multiple long-range dependency tasks and has broad application prospects.
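
The contrast drawn under Solutions, between a sequential LIF neuron with a scalar state and an SSM-based neuron whose spikes are sampled stochastically and in parallel, can be illustrated with a minimal sketch. It is not the paper's implementation: the diagonal state matrix, the sigmoid mapping from SSM output to firing probability, the soft reset in the LIF neuron, and every function name and parameter value below are assumptions chosen for illustration. The summary does not specify the SpikeSampler's actual sampling rule or its surrogate function.

```python
import math
import random


def lif_spikes(inputs, beta=0.9, threshold=1.0):
    """Sequential LIF neuron: scalar membrane potential, Heaviside threshold."""
    u, spikes = 0.0, []
    for x in inputs:
        u = beta * u + x                  # leaky integration of the scalar state
        s = 1 if u >= threshold else 0    # deterministic Heaviside step
        u -= s * threshold                # soft reset (assumed): step t+1 depends on the spike at t
        spikes.append(s)
    return spikes


def ssm_recurrent(inputs, a, b, c):
    """Diagonal linear SSM: h_t = a * h_{t-1} + b * x_t, y_t = c . h_t."""
    h = [0.0] * len(a)                    # multi-dimensional hidden state
    outputs = []
    for x in inputs:
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        outputs.append(sum(ci * hi for ci, hi in zip(c, h)))
    return outputs


def ssm_convolution(inputs, a, b, c):
    """Same outputs via the kernel K[k] = sum_i c_i * a_i**k * b_i.

    Each y_t is a function of the inputs only, so every time step can be
    computed independently of the others.
    """
    n = len(inputs)
    kernel = [sum(ci * ai**k * bi for ai, bi, ci in zip(a, b, c)) for k in range(n)]
    return [sum(kernel[k] * inputs[t - k] for k in range(t + 1)) for t in range(n)]


def sample_spikes(outputs, rng):
    """Hypothetical stochastic spike rule: s_t ~ Bernoulli(sigmoid(y_t))."""
    probs = [1.0 / (1.0 + math.exp(-y)) for y in outputs]
    spikes = [1 if rng.random() < p else 0 for p in probs]
    return probs, spikes


rng = random.Random(0)
x = [rng.choice([0, 1]) for _ in range(16)]            # toy input spike train
a, b, c = [0.9, 0.5, -0.3], [1.0, 0.5, 0.25], [0.6, -0.4, 0.8]

y_rec = ssm_recurrent(x, a, b, c)
y_conv = ssm_convolution(x, a, b, c)
probs, spikes = sample_spikes(y_conv, rng)
```

In this sketch the LIF reset makes each step depend on the previous spike, so the loop cannot be unrolled. The SSM outputs, by contrast, are a linear function of the inputs, and because the sampled spikes do not feed back into the state, all time steps can be evaluated independently. The naive convolution here is quadratic in sequence length and serves only to show the equivalence with the recurrence.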