Abstract:<p>Neuromorphic data, which record frameless spike events, have attracted considerable attention for their spatiotemporal information and event-driven processing. Spiking neural networks (SNNs) are a family of event-driven models with spatiotemporal dynamics for neuromorphic computing, and they are widely benchmarked on neuromorphic data. Researchers in the machine learning community may argue that recurrent (artificial) neural networks (RNNs) can also extract spatiotemporal features even though they are not event-driven. This raises a question that remains open: what happens if these two kinds of models are benchmarked together on neuromorphic data?</p><p>In this work, we conduct a systematic comparison of SNNs and RNNs on neuromorphic data, taking vision datasets as a case study. First, we identify the similarities and differences between SNNs and RNNs (including vanilla RNNs and LSTM) from the modeling and learning perspectives. To improve comparability and fairness, we unify the supervised learning algorithm based on backpropagation through time (BPTT), the loss function that exploits the outputs at all timesteps, the network structure with stacked fully-connected or convolutional layers, and the training hyper-parameters. In particular, we modify the mainstream loss function used in RNNs, inspired by the rate coding scheme, so that it approaches that of SNNs. Furthermore, we tune the temporal resolution of the datasets to test model robustness and generalization. Finally, a series of contrast experiments is conducted on two types of neuromorphic datasets: DVS-converted (N-MNIST) and DVS-captured (DVS Gesture). Extensive insights regarding recognition accuracy, feature extraction, temporal resolution and contrast, learning generalization, computational complexity, and parameter volume are provided, which are beneficial for model selection on different workloads and even for the invention of novel neural models in the future.</p>
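
The abstract names two ideas without giving their details: tuning the temporal resolution of an event stream, and a loss that uses the outputs at all timesteps in the spirit of rate coding. The following is a minimal sketch of what these could look like, not the authors' implementation. The function names `events_to_frames` and `rate_coded_mse`, the binary-frame encoding, and the mean-squared-error form of the loss are all assumptions made for illustration.

```python
import numpy as np

def events_to_frames(t, x, y, dt, n_steps, height, width):
    """Bin events into n_steps binary frames, each covering dt time units.

    Assumption: timestamps are non-negative and a pixel is set to 1 if at
    least one event falls on it within the time bin.
    """
    t, x, y = np.asarray(t), np.asarray(x), np.asarray(y)
    frames = np.zeros((n_steps, height, width), dtype=np.float32)
    step = (t // dt).astype(int)      # index of the time bin for each event
    keep = step < n_steps             # drop events beyond the last bin
    frames[step[keep], y[keep], x[keep]] = 1.0
    return frames

def rate_coded_mse(outputs, target):
    """MSE between the time-averaged outputs and a one-hot target.

    outputs: array of shape (T, n_classes), one row per timestep.
    Assumption: averaging over T plays the role of a firing rate.
    """
    rate = outputs.mean(axis=0)
    return float(np.mean((rate - target) ** 2))

# Four hypothetical events at times 0, 5, 12, 25 on a 2x2 sensor.
t, x, y = [0, 5, 12, 25], [0, 1, 1, 0], [0, 0, 1, 1]
fine = events_to_frames(t, x, y, dt=10, n_steps=2, height=2, width=2)
coarse = events_to_frames(t, x, y, dt=20, n_steps=2, height=2, width=2)

outputs = np.array([[1, 0], [0, 0], [1, 0], [1, 1]], dtype=float)
loss = rate_coded_mse(outputs, np.array([1.0, 0.0]))
```

A larger `dt` packs more events into each frame, so the same number of timesteps covers a longer window at a coarser temporal resolution. The same loss can be applied to spike outputs of an SNN or to real-valued outputs of an RNN, which is the kind of unification the abstract describes.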

Enhancing SNN-based Spatio-Temporal Learning: A Benchmark Dataset and Cross-Modality Attention Model

Event-Based Multimodal Spiking Neural Network with Attention Mechanism

A Spatial–Channel–Temporal-Fused Attention for Spiking Neural Networks

RSNN: Recurrent Spiking Neural Networks for Dynamic Spatial-Temporal Information Processing

CMCI: A Robust Multimodal Fusion Method for Spiking Neural Networks

Advancing Spiking Neural Networks towards Multiscale Spatiotemporal Interaction Learning

CSNN: an Augmented Spiking Based Framework with Perceptron-Inception

Spatial-Temporal Self-Attention for Asynchronous Spiking Neural Networks

TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks

An Event-based Categorization Model Using Spatio-temporal Features in a Spiking Neural Network

Deep CovDenseSNN: A Hierarchical Event-Driven Dynamic Framework with Spiking Neurons in Noisy Environment

Enhancing Adaptive History Reserving by Spiking Convolutional Block Attention Module in Recurrent Neural Networks

STSC-SNN: Spatio-Temporal Synaptic Connection with temporal convolution and attention for spiking neural networks

STCA-SNN: Self-Attention-based Temporal-Channel Joint Attention for Spiking Neural Networks

Hierarchical Spiking-Based Model for Efficient Image Classification with Enhanced Feature Extraction and Encoding

Event-based Action Recognition Using Motion Information and Spiking Neural Networks

Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences

Temporal-wise Attention Spiking Neural Networks for Event Streams Classification

Enhancing spiking neural networks with hybrid top-down attention

An Event-Driven Object Recognition Model Using Activated Connected Domain Detection