Prajna G. Malettira, Shubham Negi, Wachirawit Ponghiran, Kaushik Roy
Abstract: Spiking Neural Networks (SNNs) with their bio-inspired Leaky Integrate-and-Fire (LIF) neurons inherently capture temporal information. This makes them well-suited for sequential tasks like processing event-based data from Dynamic Vision Sensors (DVS) and event-based speech tasks. Harnessing the temporal capabilities of SNNs requires mitigating vanishing spikes during training, capturing spatio-temporal patterns and enhancing precise spike timing. To address these challenges, we propose TSkips, augmenting SNN architectures with forward and backward skip connections that incorporate explicit temporal delays. These connections capture long-term spatio-temporal dependencies and facilitate better spike flow over long sequences. The introduction of TSkips creates a vast search space of possible configurations, encompassing skip positions and time delay values. To efficiently navigate this search space, this work leverages training-free Neural Architecture Search (NAS) to identify optimal network structures and corresponding delays. We demonstrate the effectiveness of our approach on four event-based datasets: DSEC-flow for optical flow estimation, DVS128 Gesture for hand gesture recognition and Spiking Heidelberg Digits (SHD) and Spiking Speech Commands (SSC) for speech recognition. Our method achieves significant improvements across these datasets: up to 18% reduction in Average Endpoint Error (AEE) on DSEC-flow, 8% increase in classification accuracy on DVS128 Gesture, and up to 8% and 16% higher classification accuracy on SHD and SSC, respectively.
### What problems does this paper attempt to solve?
This paper addresses three key challenges that Spiking Neural Networks (SNNs) face when processing event-based data:
1. **Vanishing Spike Problem**: During SNN training, especially on tasks with long time series, spikes tend to vanish. This makes accurate spike propagation difficult and degrades learning.
2. **Insufficient Capture of Spatio-Temporal Patterns**: Existing SNN architectures struggle to capture long-term spatio-temporal dependencies, especially in complex data such as Dynamic Vision Sensor (DVS) streams and event-based speech.
3. **Precise Spike Timing Control**: Processing event-based data well requires more precise control of spike timing, which improves the model's sensitivity and responsiveness to temporal information.
To solve these problems, the authors propose a new mechanism, **TSkips** (Temporal Skips), which augments the SNN architecture with forward and backward skip connections that carry explicit time delays. These connections can span non-adjacent layers, which lets the network capture long-term spatio-temporal dependencies and improves spike propagation.
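The idea of a delayed skip connection can be illustrated with a toy simulation. The following is a minimal sketch, not the authors' implementation: it assumes a discrete-time LIF neuron with a fixed leak factor and hard reset, one neuron per layer, and a hypothetical `skip = (src, dst, delay, w_skip)` encoding. All names and parameter values here are illustrative assumptions.

```python
def lif_step(v, current, leak=0.9, threshold=1.0):
    """One discrete-time LIF update: leak, integrate, fire, hard reset."""
    v = leak * v + current
    if v >= threshold:
        return 0.0, 1  # reset the membrane potential and emit a spike
    return v, 0


def run_chain(inputs, n_layers=3, weight=0.5, skip=None):
    """Simulate a chain of single-neuron LIF layers over a spike train.

    skip = (src, dst, delay, w_skip) is a hypothetical encoding of one
    temporal skip: the spike emitted by layer `src` at time t - delay is
    added, scaled by w_skip, to the input current of layer `dst` at time t.
    """
    if skip is not None:
        src, dst, delay, w_skip = skip
        # Assumption: a backward (or self) skip must look at least one step back.
        if delay < 0 or (src >= dst and delay < 1):
            raise ValueError("forward skips need delay >= 0, backward skips delay >= 1")
    v = [0.0] * n_layers
    spikes = [[] for _ in range(n_layers)]  # output spike train per layer
    for t, x in enumerate(inputs):
        prev = x
        for layer in range(n_layers):
            current = weight * prev
            if skip is not None and layer == dst and t - delay >= 0:
                current += w_skip * spikes[src][t - delay]
            v[layer], s = lif_step(v[layer], current)
            spikes[layer].append(s)
            prev = s
    return spikes


inputs = [1] * 20
baseline = run_chain(inputs)
with_skip = run_chain(inputs, skip=(0, 2, 1, 0.5))
print([sum(layer) for layer in baseline])   # [6, 2, 0]
print([sum(layer) for layer in with_skip])  # [6, 2, 2]
```

With these toy parameters the last layer stays silent in the plain chain, while a forward skip from layer 0 with a one-step delay lets it fire again. This only illustrates the mechanism; it says nothing about the gains reported in the paper.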
In addition, to determine the optimal TSkip configuration efficiently, the authors use training-free Neural Architecture Search (NAS) to explore combinations of skip positions and time-delay values across architectures.
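To give a feel for why the search space grows quickly, here is a small sketch that enumerates single-skip candidates. It is an assumption-laden illustration: the `(src, dst, delay)` encoding, the rule that backward skips need a delay of at least one step, and the `score_fn` argument (a stand-in for whatever training-free proxy is used) are hypothetical and not taken from the paper.

```python
from itertools import product
from math import comb


def enumerate_tskips(n_layers, delays, allow_backward=True):
    """List single-skip candidates as (src, dst, delay) tuples."""
    configs = []
    for src, dst in product(range(n_layers), repeat=2):
        if src == dst:
            continue
        if src > dst and not allow_backward:
            continue
        for d in delays:
            # Assumption: a backward skip needs delay >= 1 to stay causal.
            if src > dst and d < 1:
                continue
            configs.append((src, dst, d))
    return configs


def best_config(configs, score_fn):
    """Pick the candidate with the highest score from a training-free proxy."""
    return max(configs, key=score_fn)


candidates = enumerate_tskips(n_layers=4, delays=[0, 1, 2, 4])
print(len(candidates))           # 42 single-skip candidates
print(comb(len(candidates), 2))  # 861 ways to place two skips
```

Even this four-layer toy case yields dozens of single-skip options and hundreds of two-skip combinations, which is why scoring candidates without training each one is attractive.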
### Main Contributions
- **Introduction of TSkips**: Enhances SNN and hybrid ANN-SNN architectures with explicit time-delay connections that transmit delayed spike information directly and capture long-term spatio-temporal patterns.
- **Optimization of TSkip Configuration**: Uses training-free NAS to efficiently identify the optimal TSkip configuration for different architectures and sequence lengths, minimizing additional computational overhead.
- **Experimental Proof of Effectiveness**: Demonstrates improvements on four event-based datasets: up to 18% lower Average Endpoint Error (AEE) on DSEC-flow (optical flow estimation), 8% higher classification accuracy on DVS128 Gesture (gesture recognition), and up to 8% and 16% higher classification accuracy on SHD and SSC (speech recognition), respectively.
Through these improvements, TSkips raises SNN performance on complex spatio-temporal data while keeping model complexity and inference energy consumption low.