Abstract: Recurrent neural networks (RNNs) hold immense potential for computations due to their Turing completeness and sequential processing capabilities, yet existing methods for their training encounter efficiency challenges. Backpropagation through time (BPTT), the prevailing method, extends the backpropagation (BP) algorithm by unrolling the RNN over time. However, this approach suffers from significant drawbacks, including the need to interleave forward and backward phases and store exact gradient information. Furthermore, BPTT has been shown to struggle with propagating gradient information for long sequences, leading to vanishing gradients. An alternative to gradient-based methods like BPTT is to stochastically approximate gradients through perturbation-based methods. This learning approach is exceptionally simple, requiring only forward passes in the network and a global reinforcement signal as feedback. Despite its simplicity, the random nature of its updates typically leads to inefficient optimization, limiting its effectiveness in training neural networks. In this study, we present a new approach to perturbation-based learning in RNNs whose performance is competitive with BPTT, while maintaining the inherent advantages over gradient-based learning. To this end, we extend the recently introduced activity-based node perturbation (ANP) method to operate in the time domain, leading to more efficient learning and generalization. Subsequently, we conduct a range of experiments to validate our approach. Our results show similar performance, convergence time and scalability when compared to BPTT, strongly outperforming standard node perturbation and weight perturbation methods. These findings suggest that perturbation-based learning methods offer a versatile alternative to gradient-based methods for training RNNs, one that may be ideally suited for neuromorphic applications.
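To make the idea of forward-only learning from a global scalar signal concrete, the following is a minimal sketch of plain weight perturbation, the simplest of the baselines the abstract names. It is not the ANP method of the paper. The one-unit RNN, the function names `rnn_loss` and `wp_step`, and the values of `lr` and `sigma` are all assumptions chosen for illustration.

```python
import math
import random


def rnn_loss(params, xs, target):
    """Run a one-unit RNN over xs; return the squared error of the final readout."""
    w_x, w_h, w_o = params  # input, recurrent, and output weights (hypothetical toy model)
    h = 0.0
    for x in xs:
        h = math.tanh(w_x * x + w_h * h)
    return (w_o * h - target) ** 2


def wp_step(params, xs, target, lr=0.05, sigma=1e-3, rng=random):
    """One weight-perturbation update: two forward passes, no backward pass."""
    noise = [rng.gauss(0.0, 1.0) for _ in params]
    clean = rnn_loss(params, xs, target)
    noisy = rnn_loss([p + sigma * n for p, n in zip(params, noise)], xs, target)
    # The loss difference is the only feedback: a single global scalar.
    scale = (noisy - clean) / sigma
    # Move against the noise direction in proportion to how much it hurt.
    return [p - lr * scale * n for p, n in zip(params, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    params = [0.5, 0.5, 0.5]
    xs, target = [1.0, -0.5, 0.25], 0.3
    print("loss before:", rnn_loss(params, xs, target))
    for _ in range(200):
        params = wp_step(params, xs, target, rng=rng)
    print("loss after: ", rnn_loss(params, xs, target))
```

Each update is a noisy estimate of a gradient step along one random direction, which is the source of the inefficiency the abstract attributes to perturbation-based methods.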

An Augmented Lagrangian Method for Training Recurrent Neural Networks

RNN algorithm optimization based on extended unsaturated region

Residual Recurrent Neural Networks for Learning Sequential Representations

Convergence of a Recurrent Neural Network for Nonconvex Optimization Based on an Augmented Lagrangian Function

A fast algorithm for solving large scale nonlinear optimization problems using RNN

Training Recurrent Neural Networks by Sequential Least Squares and the Alternating Direction Method of Multipliers

A discrete-time neural network for optimization problems with hybrid constraints

ADMMiRNN: Training RNN with Stable Convergence Via an Efficient ADMM Approach

Gradient-Free Training of Recurrent Neural Networks using Random Perturbations

DRRNets: Dynamic Recurrent Routing Via Low-Rank Regularization in Recurrent Neural Networks

Accelerated Levenberg-Marquardt Algorithm for Radial Basis Function Neural Network

Scalable Online Recurrent Learning Using Columnar Neural Networks

A New Recurrent Neural Network with Noise-Tolerance and Finite-Time Convergence for Dynamic Quadratic Minimization

P-ADMMiRNN: Training RNN with Stable Convergence via An Efficient and Paralleled ADMM Approach

Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training

Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics

Two Recurrent Neural Networks With Reduced Model Complexity for Constrained l1-Norm Optimization

Path Space for Recurrent Neural Networks with ReLU Activations

Alleviate Exposure Bias in Sequence Prediction with Recurrent Neural Networks

An Augmented Lagrangian Method for Non-Lipschitz Nonconvex Programming
