An introduction to reinforcement learning for neuroscience

Kristopher T. Jensen
2024-08-02
Abstract: Reinforcement learning has a rich history in neuroscience, from early work on dopamine as a reward prediction error signal for temporal difference learning (Schultz et al., 1997) to recent work suggesting that dopamine could implement a form of 'distributional reinforcement learning' popularized in deep learning (Dabney et al., 2020). Throughout this literature, there has been a tight link between theoretical advances in reinforcement learning and neuroscientific experiments and findings. As a result, the theories describing our experimental data have become increasingly complex and difficult to navigate. In this review, we cover the basic theory underlying classical work in reinforcement learning and build up to an introductory overview of methods in modern deep reinforcement learning that have found applications in systems neuroscience. We start with an overview of the reinforcement learning problem and classical temporal difference algorithms, followed by a discussion of 'model-free' and 'model-based' reinforcement learning together with methods such as DYNA and successor representations that fall in between these two extremes. Throughout these sections, we highlight the close parallels between such machine learning methods and related work in both experimental and theoretical neuroscience. We then provide an introduction to deep reinforcement learning with examples of how these methods have been used to model different learning phenomena in systems neuroscience, such as meta-reinforcement learning (Wang et al., 2018) and distributional reinforcement learning (Dabney et al., 2020). Code that implements the methods discussed in this work and generates the figures is also provided.
Neurons and Cognition, Machine Learning
What problem does this paper attempt to address?
The paper reviews the theory of reinforcement learning and its applications in neuroscience. Specifically, it addresses the following core topics:
1. **Basic Theories and Methods of Reinforcement Learning**: It introduces classical reinforcement learning algorithms (such as temporal difference learning and Q-learning) and explains how these algorithms use reward signals to optimize behavioral strategies.
2. **Applications of Reinforcement Learning in Neuroscience**: It discusses how reinforcement learning theory explains patterns of neural activity, and how dopamine can drive learning by acting as a reward prediction error signal. The paper particularly emphasizes the similarity between the activity of dopamine neurons and the quantities computed by reinforcement learning algorithms.
3. **Applications of Modern Deep Reinforcement Learning**: It explores how deep reinforcement learning methods have been used in neuroscience, especially to model complex learning phenomena such as meta-reinforcement learning and distributional reinforcement learning. These methods can handle high-dimensional problems and can also simulate complex decision-making processes.
4. **Differences Between Model-Free and Model-Based Reinforcement Learning and Their Neural Correlates**: It compares model-free and model-based reinforcement learning and discusses their respective advantages and limitations. It also introduces methods that lie between the two, such as the DYNA algorithm and successor representations.
5. **Connections Between Reinforcement Learning Theory and Biological Neural Mechanisms**: It explores how reinforcement learning theory can be applied to explain the functions of specific brain regions (such as the prefrontal cortex and striatum) and proposes hypotheses on how these regions might gradually shift from model-free to model-based strategies.
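To make the temporal difference learning and reward prediction error ideas in items 1 and 2 concrete, here is a minimal sketch of TD(0) value learning. It is not taken from the paper's accompanying code: the deterministic chain environment, the function name `td0_chain`, and the values of `alpha` and `gamma` are all assumptions chosen for illustration. The quantity `delta` is the reward prediction error that the dopamine literature compares with dopamine neuron activity.

```python
import numpy as np


def td0_chain(n_states=5, n_episodes=500, alpha=0.1, gamma=0.9):
    """TD(0) on a hypothetical deterministic chain (illustrative assumption).

    The agent always moves from state s to s + 1 and receives a reward of 1
    on entering the terminal state; all other rewards are 0.
    """
    # One extra entry for the terminal state, whose value stays at 0.
    V = np.zeros(n_states + 1)
    last_deltas = np.zeros(n_states)
    for _ in range(n_episodes):
        for s in range(n_states):
            s_next = s + 1
            r = 1.0 if s_next == n_states else 0.0
            # Reward prediction error: outcome plus discounted next value,
            # minus the current prediction.
            delta = r + gamma * V[s_next] - V[s]
            V[s] += alpha * delta
            last_deltas[s] = delta
    return V[:n_states], last_deltas


values, deltas = td0_chain()
print(values)  # learned state values, increasing towards the rewarded end
print(deltas)  # prediction errors in the final episode
```

In this toy setting the learned values approach `gamma ** (n_states - 1 - s)` for each state `s`, and the prediction errors shrink towards zero as the reward becomes fully predicted.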
In summary, this paper aims to provide neuroscience researchers with a comprehensive framework of reinforcement learning theory and demonstrate how these theories help us better understand how organisms learn from experience and adapt to their environment.