Abstract: The cerebral cortex, cerebellum, and basal ganglia play central roles in flexible learning in mammals. However, how these three structures work together is not fully understood. Recently, it has been suggested that reinforcement learning may be implemented not only in the basal ganglia but also in the cerebellum, because the activity of cerebellar climbing fibers represents reward prediction error. If the same learning mechanism based on reward prediction error operates simultaneously in the basal ganglia and cerebellum, it remains unclear how these two regions function together. Here, we recorded neuronal activity in the output nuclei of the cerebellum and basal ganglia, the cerebellar nuclei and substantia nigra pars reticulata, respectively, in ChR2 transgenic rats using high-density Neuropixels probes while optogenetically stimulating the cerebral cortex point by point. The temporal response patterns could be categorized into two classes in both the cerebellar nuclei and the substantia nigra pars reticulata. Among them, the fast excitatory response of the cerebellar nuclei driven by mossy fiber input and the inhibitory response of the substantia nigra pars reticulata via the direct pathway were synchronized. This coincidence, reproduced in a spiking network simulation based on connectome data, was expected to synchronously activate the cerebral cortex via the thalamus. To further investigate the significance of this synchronous positive feedback, we constructed a reservoir model that mimics the time course of the activity dynamics of the cerebral cortex and the temporal responses of the cerebellar nuclei and substantia nigra pars reticulata. Plasticity of both the parallel fiber inputs to Purkinje cells and the corticostriatal synapses onto direct-pathway striatal neurons was essential for successful learning of a reinforcement learning task. Notably, learning was inhibited when the timing of the cerebellar or basal ganglia output was delayed by 10 ms relative to the recorded data; the larger the delay, the slower the learning. This requirement for temporal precision was observed only when the cerebral cortex operated in the β-to-γ frequency range. These results indicate that coordinated output of the cerebellum and basal ganglia, with input from the cerebral cortex in a narrow frequency band, facilitates brain-wide synergistic reinforcement learning. Thus, our findings contribute to a holistic understanding of the interactions among the cerebellum, basal ganglia, and cerebral cortex.

A Computational Theory of Learning Flexible Reward-Seeking Behavior with Place Cells

A computational model of learning flexible navigation in a maze by layout-conforming replay of place cells

The cost of behavioral flexibility: reversal learning driven by a spiking neural network

A Model of Place Field Reorganization During Reward Maximization

Neuro-Inspired Reinforcement Learning to Improve Trajectory Prediction in Reward-Guided Behavior

How cortico-basal ganglia-thalamic subnetworks can shift decision policies to maximize reward rate

Reward Bases: A simple mechanism for adaptive acquisition of multiple reward types

Neuron-level prediction and noise can implement flexible reward-seeking behavior

A Neural Model of the Frontal Eye Fields with Reward-Based Learning

Predictive Coding of Reward in the Hippocampus

Reward Coding of Hippocampal Neurons in Goal-directed Spatial Memory

Learning to express reward prediction error-like dopaminergic activity requires plastic representations of time

A theory of cerebral learning regulated by the reward system. I. Hypotheses and mathematical description

Learning with sparse reward in a gap junction network inspired by the insect mushroom body

A Cortical Microcircuit for Region-Specific Credit Assignment in Reinforcement Learning

A Cognitive Model Based on Neuromodulated Plasticity

Synergistic reinforcement learning by cooperation of the cerebellum and basal ganglia

A computational model of behavioral adaptation to solve the credit assignment problem

Dynamic reinforcement learning reveals time-dependent shifts in strategy during reward learning

A Brain-Inspired Model of Hippocampal Spatial Cognition Based on a Memory-Replay Mechanism

Neural networks with motivation