Implementing Online Reinforcement Learning with Clustering Neural Networks

James E. Smith
2024-02-29
Abstract: An agent employing reinforcement learning takes inputs (state variables) from an environment and performs actions that affect the environment in order to achieve some objective. Rewards (positive or negative) guide the agent toward improved future actions. This paper builds on prior clustering neural network research by constructing an agent with biologically plausible neo-Hebbian three-factor synaptic learning rules, with a reward signal as the third factor (in addition to pre- and post-synaptic spikes). The classic cart-pole problem (balancing an inverted pendulum) is used as a running example throughout the exposition. Simulation results demonstrate the efficacy of the approach, and the proposed method may eventually serve as a low-level component of a more general method.
Neural and Evolutionary Computing
What problem does this paper attempt to address?
The paper addresses how to achieve online reinforcement learning with biologically plausible three-factor synaptic learning rules. Specifically, it constructs an agent based on a clustering neural network (ClNN) architecture and evaluates that agent in simulation on the classic cart-pole problem. Its main contribution is a method that combines biological plausibility with reinforcement learning, supported by simulation results that demonstrate the method's effectiveness on cart-pole and suggest it could serve as a low-level component of a more general method.
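
To make the idea of a three-factor rule concrete, the following is a minimal sketch of a generic reward-modulated Hebbian update. It is not the paper's learning rule, which is not given on this page. The function name `three_factor_update`, the learning rate `lr`, the weight bounds, and the use of a simple pre-times-post coincidence term are all assumptions chosen for illustration.

```python
def three_factor_update(weights, pre_spikes, post_spike, reward,
                        lr=0.1, w_min=0.0, w_max=1.0):
    """Return updated weights for a single postsynaptic neuron.

    Illustrative assumption, not the paper's rule:
      factor 1: pre_spikes[i] is 1 if input i spiked, else 0
      factor 2: post_spike is 1 if the neuron spiked, else 0
      factor 3: reward is a signed scalar (positive or negative)
    """
    updated = []
    for w, pre in zip(weights, pre_spikes):
        # Hebbian coincidence: nonzero only if both sides spiked.
        coincidence = pre * post_spike
        # The reward sets the sign and size of the change.
        w_new = w + lr * reward * coincidence
        # Keep the weight inside the assumed bounds.
        updated.append(min(w_max, max(w_min, w_new)))
    return updated


weights = [0.5, 0.5, 0.5]
pre_spikes = [1, 0, 1]

rewarded = three_factor_update(weights, pre_spikes, post_spike=1, reward=1)
punished = three_factor_update(weights, pre_spikes, post_spike=1, reward=-1)
silent = three_factor_update(weights, pre_spikes, post_spike=0, reward=1)
```

In this sketch, only synapses whose input spiked together with the output are changed: a positive reward strengthens them, a negative reward weakens them, and nothing changes when the postsynaptic neuron stays silent. A real agent would also have to decide how to credit rewards that arrive after the spikes that caused them, which this sketch does not attempt.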