Abstract: Spiking neural networks (SNNs) are increasingly applied to deep architectures. Recent works apply spatio-temporal backpropagation to train deep SNNs directly. However, the binary and non-differentiable nature of spike activity causes directly trained SNNs to suffer from severe gradient vanishing. In this paper, we first analyze the cause of the gradient vanishing problem and identify that the gradients mostly backpropagate along the synaptic currents. Based on that, we modify the synaptic current equation of the leaky integrate-and-fire (LIF) neuron model and propose the improved LIF (IM-LIF) neuron model on the basis of a temporal-wise attention mechanism. We use the temporal-wise attention mechanism to selectively establish connections between the current and historical response values, which empirically makes the neuronal state updates resilient to the gradient vanishing problem. Furthermore, to capture the neuronal dynamics embedded in the output of networks that incorporate the IM-LIF model, we present a new temporal loss function that constrains the network output to stay close to the target distribution. The proposed temporal loss function not only acts as a regularizer that eliminates output outliers, but also assigns the network loss credit to the voltage at a specific time point. We then modify the ResNet and VGG architectures based on the IM-LIF model to build deep SNNs. We evaluate our work on image datasets and neuromorphic datasets. Experimental results and analysis show that our method can help build deep SNNs with competitive performance in both accuracy and latency, including 95.66% on CIFAR-10, 77.42% on CIFAR-100, 55.37% on Tiny-ImageNet, 97.33% on DVS-Gesture, and 80.50% on CIFAR-DVS with very few timesteps.
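
For context, the leaky integrate-and-fire dynamics that the abstract builds on can be sketched in discrete time as below. This is a generic textbook LIF update, not the IM-LIF model proposed in the paper, and it contains no attention mechanism or surrogate gradient. The function name `simulate_lif` and the parameters `tau`, `v_threshold`, and `v_reset` are illustrative assumptions, not names taken from the paper. The hard threshold that produces the binary spike is the non-differentiable step the abstract refers to.

```python
def simulate_lif(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one discrete-time LIF neuron (generic sketch, not IM-LIF).

    inputs: sequence of input currents, one value per timestep.
    Returns (spikes, voltages), where voltages are recorded before reset.
    """
    v = v_reset
    spikes, voltages = [], []
    for x in inputs:
        # Leaky integration: the membrane potential decays toward v_reset
        # while accumulating the input current (assumed time constant tau).
        v = v + (x - (v - v_reset)) / tau
        voltages.append(v)
        # Hard threshold: binary output whose derivative is zero almost
        # everywhere, which is why direct training needs special handling.
        spike = 1 if v >= v_threshold else 0
        spikes.append(spike)
        if spike:
            v = v_reset  # hard reset after firing
    return spikes, voltages


if __name__ == "__main__":
    spikes, voltages = simulate_lif([1.5, 1.5, 1.5, 1.5])
    print(spikes)
    print(voltages)
```

With a constant input the neuron charges over several timesteps, fires once the threshold is crossed, and then resets, so the output is a sparse binary spike train rather than a continuous activation.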

An Improvement on Recurrent Neural Network by Combining Convolution Neural Network and a Simple Initialization of the Weights

Advances in Convolutional Neural Networks

Residual Recurrent Neural Networks for Learning Sequential Representations.

Improving performance of recurrent neural network with relu nonlinearity

IM-LIF: Improved Neuronal Dynamics with Attention Mechanism for Direct Training Deep Spiking Neural Network

Weighted Residuals for Very Deep Networks

A Recalling-Enhanced Recurrent Neural Network: Conjugate Gradient Learning Algorithm and Its Convergence Analysis

Multiple-Weight Recurrent Neural Networks.

Improving Deep Neural Network with Multiple Parametric Exponential Linear Units

Recurrent Layer Aggregation Using LSTM.

Reversible Recurrent Neural Networks

Learning Longer Memory in Recurrent Neural Networks

Weight initialization based‐rectified linear unit activation function to improve the performance of a convolutional neural network model

Deep Residual Networks and Weight Initialization

Optimizing and Contrasting Recurrent Neural Network Architectures

Greedy Layer-Wise Training of Long Short Term Memory Networks

Independently Recurrent Neural Network (IndRNN): Building a Longer and Deeper RNN.

Improving Learning Efficiency of Recurrent Neural Network Through Adjusting Weights of All Layers in a Biologically-Inspired Framework.

Isomorphic Model-Based Initialization for Convolutional Neural Networks

On the importance of network architecture in training very deep neural networks

An Improved Method and Application of Recurrent Convolutional Neural Network with Self-attention Mechanism