A Spiking Neural Network Structure Implementing Reinforcement Learning

Mikhail Kiselev
2023-09-23
Abstract: At present, the implementation of learning mechanisms in spiking neural networks (SNN) cannot be considered a solved scientific problem, despite the many SNN learning algorithms that have been proposed. This is also true of SNN implementations of reinforcement learning (RL), even though RL is especially important for SNNs because of its close relationship to the domains most promising for SNN application, such as robotics. In the present paper, I describe an SNN structure which, seemingly, can be used in a wide range of RL tasks. The distinctive feature of my approach is that all signals involved take the form of spikes only: sensory input streams, output signals sent to actuators, and reward/punishment signals. In addition, in selecting the neuron and plasticity models, I was guided by the requirement that they should be easy to implement on modern neurochips. The SNN structure considered in the paper includes spiking neurons described by a generalization of the LIFAT (leaky integrate-and-fire neuron with adaptive threshold) model and a simple spike-timing-dependent synaptic plasticity model (a generalization of dopamine-modulated plasticity). My concept is based on very general assumptions about RL task characteristics and has no visible limitations on its applicability. To test it, I selected a simple but non-trivial task: training the network to keep a chaotically moving light spot in the field of view of an emulated DVS camera. The successful solution of this RL problem by the SNN described can be considered evidence in favor of the efficiency of my approach.
Neural and Evolutionary Computing
What problem does this paper attempt to address?
The main problem that this paper attempts to solve is how to implement reinforcement learning (RL) in spiking neural networks (SNN). Although many learning algorithms for SNNs have been proposed, achieving an effective, general-purpose learning mechanism in SNNs remains an unsolved scientific problem, especially for reinforcement learning. The author proposes a new SNN structure that is designed to be widely applicable to RL tasks. Its distinguishing characteristic is that all signals (sensory input streams, output signals sent to actuators, and reward/punishment signals) exist in the form of spikes. In addition, the selected neuron and plasticity models are easy to implement on modern neural chips, such as Loihi.

Specifically, the SNN structure described in the paper contains the following elements:

- **Spiking neurons**: A generalized version of the LIFAT (leaky integrate-and-fire neuron with adaptive threshold) model is used.
- **Simple spike-timing-dependent synaptic plasticity model**: This is a generalized version of dopamine-modulated plasticity.
- **Spike-only signal processing**: All signals are processed in the form of spikes, which differs from traditional firing-rate-based information encoding and is more in line with the characteristics of modern neural chips.

To test the effectiveness of this SNN structure, the author selected a simple but non-trivial task: training the network to keep a chaotically moving light spot in the field of view of an emulated DVS camera. Successfully solving this RL problem can serve as evidence of the effectiveness of the method.
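
To make the neuron model named above more concrete, here is a minimal Python sketch of a plain leaky integrate-and-fire neuron with an adaptive threshold. It is not the generalized LIFAT model from the paper, whose exact equations are not given in this summary. The class name `LIFATNeuron`, the helper `simulate`, the discrete-time update rule, and all parameter values are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LIFATNeuron:
    """Textbook-style LIF neuron with adaptive threshold (illustrative only)."""
    tau_v: float = 10.0           # membrane leak time constant, in steps (assumed)
    tau_thr: float = 50.0         # threshold adaptation time constant (assumed)
    base_threshold: float = 1.0   # resting firing threshold (assumed)
    threshold_jump: float = 0.5   # threshold increase after each spike (assumed)
    v: float = 0.0                # membrane potential
    thr_extra: float = 0.0        # adaptive part of the threshold

    def step(self, input_current: float) -> bool:
        """Advance one time step; return True if the neuron fires."""
        # Both the potential and the extra threshold decay exponentially.
        self.v -= self.v / self.tau_v
        self.thr_extra -= self.thr_extra / self.tau_thr
        # Integrate the input received during this step.
        self.v += input_current
        if self.v >= self.base_threshold + self.thr_extra:
            self.v = 0.0                           # reset after the spike
            self.thr_extra += self.threshold_jump  # firing becomes harder
            return True
        return False


def simulate(neuron: LIFATNeuron, currents: List[float]) -> List[int]:
    """Return the time steps at which the neuron fired."""
    return [t for t, current in enumerate(currents) if neuron.step(current)]


if __name__ == "__main__":
    spike_times = simulate(LIFATNeuron(), [0.3] * 60)
    print(spike_times)
```

Under a constant input, the gaps between successive spikes lengthen, because each spike raises the threshold faster than it decays. This kind of adaptation is what the "AT" in LIFAT refers to.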