Abstract: Reinforcement learning has a rich history in neuroscience, from early work on dopamine as a reward prediction error signal for temporal difference learning (Schultz et al., 1997) to recent work suggesting that dopamine could implement a form of 'distributional reinforcement learning' popularized in deep learning (Dabney et al., 2020). Throughout this literature, there has been a tight link between theoretical advances in reinforcement learning and neuroscientific experiments and findings. As a result, the theories describing our experimental data have become increasingly complex and difficult to navigate. In this review, we cover the basic theory underlying classical work in reinforcement learning and build up to an introductory overview of methods in modern deep reinforcement learning that have found applications in systems neuroscience. We start with an overview of the reinforcement learning problem and classical temporal difference algorithms, followed by a discussion of 'model-free' and 'model-based' reinforcement learning together with methods such as DYNA and successor representations that fall in between these two extremes. Throughout these sections, we highlight the close parallels between such machine learning methods and related work in both experimental and theoretical neuroscience. We then provide an introduction to deep reinforcement learning with examples of how these methods have been used to model different learning phenomena in systems neuroscience, such as meta-reinforcement learning (Wang et al., 2018) and distributional reinforcement learning (Dabney et al., 2020). Code that implements the methods discussed in this work and generates the figures is also provided.

What problem does this paper attempt to address?

The paper reviews reinforcement learning theory and its applications in neuroscience. Specifically, it addresses the following core topics:

1. **Basic Theories and Methods of Reinforcement Learning**: It first introduces classical reinforcement learning algorithms (such as temporal difference learning and Q-learning) and explains how these algorithms use reward signals to optimize behavioral strategies.
2. **Applications of Reinforcement Learning in Neuroscience**: It discusses in detail how reinforcement learning theory explains activity patterns in the nervous system, and how dopamine may drive learning by acting as a reward prediction error signal. The paper particularly emphasizes the similarity between the activity of dopamine neurons and the error signals used by reinforcement learning algorithms.
3. **Applications of Modern Deep Reinforcement Learning**: It further explores how deep reinforcement learning methods have been applied in neuroscience, especially to model complex learning phenomena such as meta-reinforcement learning and distributional reinforcement learning. These methods can handle high-dimensional problems and can also simulate complex decision-making processes.
4. **Differences Between Model-Free and Model-Based Reinforcement Learning and Their Neural Correlates**: It compares model-free and model-based reinforcement learning, discussing their respective advantages and limitations. It also introduces methods that lie between the two, such as the DYNA algorithm and successor representations.
5. **Connections Between Reinforcement Learning Theory and Biological Neural Mechanisms**: It explores how reinforcement learning theory can be applied to explain the functions of specific brain regions (such as the prefrontal cortex and striatum) and proposes hypotheses on how these regions might gradually shift from model-free to model-based strategies.

In summary, this paper aims to provide neuroscience researchers with a comprehensive framework of reinforcement learning theory and demonstrate how these theories help us better understand how organisms learn from experience and adapt to their environment.
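
The reward prediction error idea behind temporal difference learning, mentioned in the summary above, can be illustrated with a minimal sketch. This is not the code released with the paper; the task (a deterministic chain with a single reward at the end), the function name `td0_chain`, and all parameter values are assumptions chosen purely for illustration.

```python
def td0_chain(n_states=5, episodes=2000, alpha=0.1, gamma=0.9):
    """Tabular TD(0) value learning on a hypothetical deterministic chain.

    The agent starts in state 0 and moves one state to the right per step.
    Leaving the last state yields reward 1 and ends the episode.
    Returns the learned values and the prediction errors of the final episode.
    """
    values = [0.0] * n_states
    last_deltas = []
    for _ in range(episodes):
        deltas = []
        for s in range(n_states):
            terminal = s == n_states - 1
            reward = 1.0 if terminal else 0.0
            # The value after the terminal transition is defined as zero.
            next_value = 0.0 if terminal else values[s + 1]
            # Reward prediction error: received plus predicted future reward,
            # minus what the current state was expected to deliver.
            delta = reward + gamma * next_value - values[s]
            values[s] += alpha * delta
            deltas.append(delta)
        last_deltas = deltas
    return values, last_deltas


if __name__ == "__main__":
    values, deltas = td0_chain()
    print("learned values:", [round(v, 3) for v in values])
    print("largest remaining |prediction error|:", max(abs(d) for d in deltas))
```

In this toy setting the learned values approach `gamma ** (n_states - 1 - s)` for each state `s`, and the prediction errors shrink toward zero once the reward is fully predicted, which is the qualitative signature that the dopamine literature cited in the abstract relates to temporal difference learning.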

An introduction to reinforcement learning for neuroscience

Deep Reinforcement Learning and its Neuroscientific Implications

Reinforcement Learning and its Connections with Neuroscience and Psychology

Prefrontal cortex as a meta-reinforcement learning system

A learning gap between neuroscience and reinforcement learning

Advanced Reinforcement Learning and Its Connections with Brain Neuroscience

Hierarchical reinforcement learning and decision making

Hierarchically organized behavior and its neural foundations: A reinforcement learning perspective

A distributional code for value in dopamine-based reinforcement learning

Reinforcement Learning in a Neurally Controlled Robot Using Dopamine Modulated STDP

Temporal-Difference Learning Using Distributed Error Signals

Reinforcement learning using a continuous time actor-critic framework with spiking neurons

The many worlds hypothesis of dopamine prediction error: implications of a parallel circuit architecture in the basal ganglia

Dopamine: A Research Framework for Deep Reinforcement Learning

A deep learning framework for neuroscience

Representation and Timing in Theories of the Dopamine System

The structure of reinforcement-learning mechanisms in the human brain

Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

Model-based reward prediction in the primate prefrontal cortex

An opponent striatal circuit for distributional reinforcement learning

Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey