Synergistic reinforcement learning by cooperation of the cerebellum and basal ganglia

Tatsumi Yohida, Hikaru Sugino, Hinako Yamamoto, Sho Tanno, Mikihide Tamura, Jun Igarashi, Yoshikazu Isomura, Riichiro Hira
DOI: https://doi.org/10.1101/2024.07.12.603330
2024-07-13
Abstract: The cerebral cortex, cerebellum, and basal ganglia play a central role in flexible learning in mammals. However, how these three structures work together is not fully understood. Recently, it has been suggested that reinforcement learning may be implemented not only in the basal ganglia but also in the cerebellum, as the activity of cerebellar climbing fibers represents reward prediction error. If the same learning mechanism via reward prediction error operates simultaneously in the basal ganglia and cerebellum, it remains unclear how these two regions co-function. Here, we recorded neuronal activity in the output nuclei of the cerebellum and basal ganglia, the cerebellar nuclei and substantia nigra pars reticulata, respectively, from ChR2 transgenic rats with high-density Neuropixels probes while optogenetically stimulating the cerebral cortex point-by-point. The temporal response patterns could be categorized into two classes in both the cerebellar nuclei and the substantia nigra pars reticulata. Among them, the fast excitatory response of the cerebellar nuclei, driven by mossy fiber input, and the inhibitory response of the substantia nigra pars reticulata via the direct pathway were synchronized. This coincidence, reproduced in a spiking network simulation based on connectome data, was expected to synchronously activate the cerebral cortex via the thalamus. To further investigate the significance of this synchronous positive feedback, we constructed a reservoir model that mimics the time course of the activity dynamics of the cerebral cortex and the temporal responses of the cerebellar nuclei and substantia nigra pars reticulata. Plasticity of both the parallel fiber inputs to Purkinje cells and the corticostriatal synapses onto direct-pathway striatal neurons was essential for successful learning of a reinforcement learning task. Notably, learning was impaired when the timing of the cerebellar or basal ganglia output was delayed by 10 ms relative to the real data; the larger this delay, the slower the learning. This requirement for temporal precision was observed only when the cerebral cortex operated in the β-to-γ frequency range. These results indicate that coordinated output of the cerebellum and basal ganglia, with input from the cerebral cortex in a narrow frequency band, facilitates brain-wide synergistic reinforcement learning. Thus, our findings contribute to a holistic understanding of the interactions among the cerebellum, basal ganglia, and cerebral cortex.
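As an illustration only, and not the authors' analysis, the following sketch gives some intuition for why a fixed 10 ms delay might matter more at higher cortical frequencies: a fixed delay corresponds to a larger fraction of the oscillation cycle as frequency increases. The function name, the example frequencies, and the band labels attached to them are assumptions chosen for illustration; the abstract does not specify band boundaries.

```python
def phase_shift_degrees(freq_hz: float, delay_ms: float) -> float:
    """Phase shift, in degrees, that a fixed delay introduces at a given frequency.

    A delay of delay_ms at frequency freq_hz spans freq_hz * delay_ms / 1000
    cycles, and one cycle is 360 degrees.
    """
    return 360.0 * freq_hz * (delay_ms / 1000.0)


# Hypothetical example frequencies; the band labels are conventional and
# are NOT taken from the paper.
EXAMPLE_FREQS_HZ = {
    "theta": 6.0,
    "alpha": 10.0,
    "beta": 20.0,
    "low gamma": 40.0,
}


def delay_table(delay_ms: float = 10.0) -> dict:
    """Map each example band to the phase shift caused by delay_ms."""
    return {
        band: phase_shift_degrees(freq, delay_ms)
        for band, freq in EXAMPLE_FREQS_HZ.items()
    }


if __name__ == "__main__":
    for band, degrees in delay_table().items():
        print(f"{band:>9}: {degrees:6.1f} degrees of phase shift")
```

Under these assumed frequencies, a 10 ms delay shifts a 6 Hz rhythm by only a small fraction of a cycle but shifts a 40 Hz rhythm by a large fraction of one. Whether this phase relationship is the mechanism behind the reported frequency dependence is not stated in the abstract.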
Neuroscience