Abstract: An ongoing challenge in neural information processing is the following question: how do neurons adjust their connectivity to improve network-level task performance over time (i.e., actualize learning)? It is widely believed that a consistent, synaptic-level learning mechanism in specific brain regions, such as the basal ganglia, drives this learning. However, the exact nature of this mechanism remains unclear. Here, we investigate the use of universal synaptic-level algorithms in training connectionist models. Specifically, we propose an algorithm based on reinforcement learning (RL) to generate and apply a simple, biologically inspired synaptic-level learning policy for neural networks. In this algorithm, the action space for each synapse in the network consists of a small increase, a small decrease, or a null action on the connection strength. To test our algorithm, we applied it to a multilayer perceptron (MLP) neural network model. The algorithm yields a static synaptic learning policy that enables the simultaneous training of over 20,000 parameters (i.e., synapses) and consistent learning convergence when applied to simulated decision boundary matching and optical character recognition tasks. The trained networks yield character-recognition performance comparable to identically shaped networks trained with gradient descent. The approach has two significant advantages over traditional gradient-descent-based optimization methods. First, the robustness of our method and its lack of reliance on gradient computations open the door to new techniques for training difficult-to-differentiate artificial neural networks, such as spiking neural networks (SNNs) and recurrent neural networks (RNNs). Second, the method’s simplicity provides a unique opportunity for further development of local-information-driven multiagent connectionist models for machine intelligence, analogous to cellular automata.
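The three-way per-synapse action space described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the step size, and in particular `toy_policy` (which simply repeats the previous action when a global reward improved and resamples otherwise) are hypothetical stand-ins, since the abstract does not specify the learned policy or the state each synapse observes.

```python
import numpy as np

# The three actions available to every synapse: decrease, null, increase.
ACTIONS = np.array([-1, 0, 1])


def apply_synaptic_actions(weights, actions, step=0.01):
    """Return new weights after each synapse takes one of the three actions.

    `step` is an assumed fixed increment size; the abstract only says "small".
    """
    actions = np.asarray(actions)
    if not np.isin(actions, ACTIONS).all():
        raise ValueError("each action must be -1, 0, or +1")
    return np.asarray(weights, dtype=float) + step * actions


def toy_policy(prev_actions, reward_improved, rng):
    """Hypothetical stand-in for the learned static policy.

    Repeats the previous action everywhere if the global reward improved,
    otherwise resamples every synapse's action uniformly at random.
    """
    prev_actions = np.asarray(prev_actions)
    if reward_improved:
        return prev_actions.copy()
    return rng.choice(ACTIONS, size=prev_actions.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = np.zeros(4)
    actions = rng.choice(ACTIONS, size=weights.shape)
    weights = apply_synaptic_actions(weights, actions)
    print(weights)
```

Note that no gradient is computed anywhere: each synapse changes its weight using only its own action, which is the property the abstract highlights for hard-to-differentiate models.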

Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance

The Disharmony between BN and ReLU Causes Gradient Explosion, but is Offset by the Correlation between Activations

Training a Two Layer ReLU Network Analytically

On Multi-Stage Loss Dynamics in Neural Networks: Mechanisms of Plateau and Descent Stages

Gradient Descent Provably Escapes Saddle Points in the Training of Shallow ReLU Networks

Understanding Multi-phase Optimization Dynamics and Rich Nonlinear Behaviors of ReLU Networks

The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models

Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data

Convergence Analysis of Two-layer Neural Networks with ReLU Activation

Normalized gradient flow optimization in the training of ReLU artificial neural networks

A proof of convergence for the gradient descent optimization method with random initializations in the training of neural networks with ReLU activation for piecewise linear target functions

Effective Rank and the Staircase Phenomenon: New Insights into Neural Network Training Dynamics

Non-convergence to global minimizers in data driven supervised deep learning: Adam and stochastic gradient descent optimization provably fail to converge to global minimizers in the training of deep neural networks with ReLU activation

Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks

Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint

Compelling ReLU Networks to Exhibit Exponentially Many Linear Regions at Initialization and During Training

Convergence proof for stochastic gradient descent in the training of deep neural networks with ReLU activation for constant target functions

Stochastic Gradient Descent Introduces an Effective Landscape-Dependent Regularization Favoring Flat Solutions

A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks

Gradient-Free Neural Network Training via Synaptic-Level Reinforcement Learning