Abstract: Animals can learn in real-life scenarios where rewards are often only available when a goal is achieved. This 'distal' or 'sparse' reward problem remains a challenge for conventional reinforcement learning algorithms. Here we investigate an algorithm for learning in such scenarios, inspired by the possibility that axo-axonal gap junction connections, observed in neural circuits with parallel fibres such as the insect mushroom body, could form a resistive network. In such a network, an active node represents the task state, connections between nodes represent state transitions and their connection to actions, and current flow to a target state can guide decision making. Building on evidence that gap junction weights are adaptive, we propose that experience of a task can modulate the connections to form a graph encoding the task structure. We demonstrate that the approach can be used for efficient reinforcement learning under sparse rewards, and discuss whether it is plausible as an account of the insect mushroom body.

Learning in situations where reward is only rarely encountered is difficult. It is hard to discover the right sequence of actions when most actions, most of the time, provide no apparent progress towards a goal. Inspired by a neural circuit in the insect brain, and using direct electrical connections between neurons as well as synaptic connections, we present a new algorithm for learning. The model represents the states of the world with nodes, and an electrical connection between two nodes is strengthened when the two corresponding states occur consecutively. The connections between nodes can also become associated with output actions that correlate with (and hence are assumed to cause) transitions between states. When a particular goal is chosen or associated with a reward (for example, the target location in a navigation task), a flow of electrical current through the nodes will find the shortest path from the present state to the goal state and trigger the appropriate actions.
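The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ResistiveGraph`, the saturating conductance update, the fixed-voltage boundary conditions, and the Gauss-Seidel relaxation used to solve for node potentials are all assumptions made for illustration. States are nodes, consecutive states strengthen a symmetric conductance, each observed transition is tagged with the action that accompanied it, and the next step is chosen by following the largest current leaving the present state.

```python
from collections import defaultdict


class ResistiveGraph:
    """Hypothetical sketch of a resistive network over task states."""

    def __init__(self, learning_rate=0.5):
        self.g = defaultdict(dict)  # g[a][b]: conductance between states a and b
        self.action = {}            # (a, b) -> action observed with transition a -> b
        self.lr = learning_rate

    def observe(self, state, action, next_state):
        """Strengthen the link between two consecutively experienced states."""
        if state == next_state:
            return
        old = self.g[state].get(next_state, 0.0)
        new = old + self.lr * (1.0 - old)  # assumed rule: saturates towards 1
        self.g[state][next_state] = new
        self.g[next_state][state] = new    # gap junctions are modelled as symmetric
        self.action[(state, next_state)] = action

    def potentials(self, source, goal, sweeps=500):
        """Hold the source at 1 and the goal at 0, then relax the other nodes."""
        v = {n: 0.0 for n in self.g}
        v[source] = 1.0
        for _ in range(sweeps):
            for n in v:
                if n == source or n == goal:
                    continue
                total = sum(self.g[n].values())
                # Kirchhoff's current law: conductance-weighted mean of neighbours
                v[n] = sum(w * v[m] for m, w in self.g[n].items()) / total
        return v

    def choose(self, source, goal):
        """Return the neighbour receiving the most current, and its action."""
        v = self.potentials(source, goal)
        nxt = max(self.g[source],
                  key=lambda m: self.g[source][m] * (v[source] - v[m]))
        return nxt, self.action.get((source, nxt))


# Two routes from A to G: a long one (A-B-C-G) and a short one (A-D-G).
net = ResistiveGraph()
for s, a, s2 in [("A", "up", "B"), ("B", "right", "C"), ("C", "down", "G"),
                 ("A", "right", "D"), ("D", "right", "G")]:
    net.observe(s, a, s2)
next_state, act = net.choose("A", "G")
```

In this toy graph all links have equal conductance, so the shorter route carries more current and the sketch selects the step towards `D`. Both the present state and the goal must already have been visited; with unequal conductances the largest-current step need not coincide with the route that has the fewest transitions.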

Learning with sparse reward in a gap junction network inspired by the insect mushroom body
