Abstract: During development, neural circuits are shaped continuously as we learn to control our bodies. The ultimate goal of this process is to produce neural dynamics that enable the rich repertoire of behaviors we perform with our limbs. What begins as a series of "babbles" coalesces into skilled motor output as the brain rapidly learns to control the body. However, the nature of the teaching signal underlying this normative learning process remains elusive. Here, we test two well-established and biologically plausible theories, supervised learning (SL) and reinforcement learning (RL), that could explain how neural circuits develop the capacity for skilled movements. We trained recurrent neural networks to control a biomechanical model of a primate arm using either SL or RL and compared the resulting neural dynamics to populations of neurons recorded from the motor cortex of monkeys performing the same movements. Intriguingly, only RL-trained networks produced neural activity that matched their biological counterparts in terms of both the geometry and dynamics of population activity. We show that the similarity between RL-trained networks and biological brains depends critically on matching biomechanical properties of the limb. We then demonstrate that monkeys and RL-trained networks, but not SL-trained networks, show a strikingly similar capacity for robust short-term behavioral adaptation to a movement perturbation, indicating a fundamental and general commonality in the neural control policy. Together, our results support the hypothesis that neural dynamics for behavioral control emerge through a process akin to reinforcement learning. The resulting neural circuits offer numerous advantages for adaptable behavioral control over simpler and more efficient learning rules and expand our understanding of how developmental processes shape neural dynamics.

What problem does this paper attempt to address?

The core problem this paper addresses is which learning mechanism, reinforcement learning (RL) or supervised learning (SL), better explains how the brain shapes neural circuits to produce skilled movement, such as arm control in primates.

### Background of the Paper

During development, neural circuits are continuously shaped as we learn to control our bodies. The goal of this process is to produce neural dynamics that support the rich repertoire of behaviors we perform with our limbs. However, the nature of the teaching signal that guides this learning process remains unknown. The paper tests two established and biologically plausible theories, SL and RL, as explanations for how neural circuits develop the capacity for skilled movement.

### Research Methods

The researchers trained recurrent neural networks (RNNs) to control a biomechanical model of a primate arm, using either SL or RL, on the same behavioral task. They then compared the networks' activity with population activity recorded from the motor cortex of monkeys performing the same movements, evaluating similarity in two respects: geometry and dynamics.

### Main Findings

1. **Geometric Similarity**: Only RL-trained networks produced activity whose geometry matched the monkey recordings, as reflected in closely matching Principal Component Analysis (PCA) results and high Canonical Correlation Analysis (CCA) scores.
2. **Dynamic Similarity**: Dynamical Similarity Analysis (DSA) shows that RL-trained networks are also closer to the monkey recordings in their dynamics, so the match extends beyond geometry to how population activity evolves over time.
3. **Biomechanical Influence**: When the simulated effector is simplified from the biomechanical arm model to a point-mass model, the similarity between RL-trained networks and the monkey recordings decreases significantly. The biomechanical properties of the limb therefore have an important influence on the neural dynamics that develop.
4. **Adaptability**: RL-trained networks, but not SL-trained networks, adapt robustly to a movement perturbation such as a visuomotor rotation, resembling the rapid short-term adaptation shown by monkeys.

### Conclusion

The results support the hypothesis that neural dynamics in the brain develop through a process similar to reinforcement learning. This learning mechanism not only generates activity patterns similar to those recorded in motor cortex but also confers robust adaptation to perturbations, which is crucial for flexible behavioral control. These findings expand our understanding of how developmental processes shape behavior-related neural dynamics and point to an important role for reinforcement learning in that process.
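
The geometric comparison listed under Main Findings (PCA followed by CCA) can be illustrated with a minimal sketch. This is not the paper's pipeline: the data are synthetic, and the function names `pca_reduce` and `canonical_correlations`, the number of components, and the array sizes are all assumptions chosen for illustration. It projects two time-by-unit activity matrices onto their top principal components and then computes canonical correlations from the orthonormal bases of the two projections.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project (time x units) activity onto its top principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def canonical_correlations(A, B):
    """Canonical correlations between two (time x dims) matrices.

    Uses the QR-based formulation: the singular values of Qa^T Qb,
    where Qa and Qb are orthonormal bases of the centered inputs.
    """
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic stand-ins (assumption): a "model" and a "recording" that share
# three latent signals, plus an unrelated control population.
rng = np.random.default_rng(0)
T = 500
latents = rng.standard_normal((T, 3))
model_rates = latents @ rng.standard_normal((3, 40)) + 0.1 * rng.standard_normal((T, 40))
neural_rates = latents @ rng.standard_normal((3, 60)) + 0.1 * rng.standard_normal((T, 60))
unrelated = rng.standard_normal((T, 60))

cc_shared = canonical_correlations(pca_reduce(model_rates, 3), pca_reduce(neural_rates, 3))
cc_unrelated = canonical_correlations(pca_reduce(model_rates, 3), pca_reduce(unrelated, 3))
print("shared latents:", cc_shared.round(3))
print("unrelated     :", cc_unrelated.round(3))
```

In this toy setting, populations driven by the same latent signals yield canonical correlations near 1, while the unrelated control yields much lower values. A summary such as the mean canonical correlation is one way a "CCA score" could be reported, though the paper's exact scoring is not specified here.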

Brain-like neural dynamics for behavioral control develop through reinforcement learning
