Delayed reinforcement learning converges to intermittent control for human quiet stance

Yongkun Zhao, Balint K Hodossy, Shibo Jing, Masahiro Todoh, Dario Farina
DOI: https://doi.org/10.1016/j.medengphy.2024.104197
Abstract: The neural control of human quiet stance remains controversial, with classic views suggesting a limited role of the brain and recent findings conversely indicating direct cortical control of muscles during upright posture. Conceptual neural feedback control models have been proposed and tested against experimental evidence. The most renowned model is the continuous impedance control model. However, when time delays are included in this model to simulate neural transmission, the continuous controller becomes unstable. Another model, the intermittent control model, assumes that the central nervous system (CNS) activates muscles intermittently, rather than continuously, to counteract gravitational torque. In this study, a delayed reinforcement learning algorithm was developed to seek an optimal control policy for balancing a one-segment inverted pendulum model representing the human body. In this approach, no a priori strategy was imposed on the controller; instead, the optimal strategy emerged from reward-based learning. The simulation results indicated that the optimal neural controller exhibits intermittent, rather than continuous, characteristics, in agreement with the possibility that the CNS intermittently provides neural feedback torque to maintain an upright posture.
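The abstract's point about delay-induced instability can be illustrated with a minimal sketch of a one-segment inverted pendulum under continuous proportional-derivative (PD) feedback that acts on delayed state. This is not the paper's reinforcement learning algorithm, and none of the numbers come from the paper. The body parameters, the feedback gains, the small-angle linearization, and the 0.5 s delay (deliberately long so that the effect is obvious) are all assumptions chosen for illustration, and `simulate_stance` is a hypothetical helper name.

```python
from collections import deque


def simulate_stance(delay_s, t_end=10.0, dt=0.001, theta0=0.01):
    """Simulate a linearized inverted pendulum with delayed PD feedback.

    Returns the list of sway angles (rad), one per time step.
    All parameter values below are illustrative assumptions.
    """
    m, h, g = 60.0, 1.0, 9.81      # assumed body mass (kg), CoM height (m), gravity
    inertia = m * h * h            # point-mass moment of inertia about the ankle
    mgh = m * g * h                # gravitational toppling stiffness (N m / rad)
    k_p = 1.5 * mgh                # assumed proportional gain (N m / rad)
    k_d = 300.0                    # assumed derivative gain (N m s / rad)

    n_delay = int(round(delay_s / dt))
    theta, omega = theta0, 0.0

    # Buffer of past states; index 0 is the state n_delay steps ago.
    # Before t = 0 the body is assumed to rest at the initial tilt.
    history = deque([(theta0, 0.0)] * (n_delay + 1), maxlen=n_delay + 1)

    trajectory = [theta]
    for _ in range(int(round(t_end / dt))):
        theta_delayed, omega_delayed = history[0]
        # Continuous feedback torque computed from the delayed state.
        torque = -(k_p * theta_delayed + k_d * omega_delayed)
        # Small-angle dynamics: I * theta'' = m g h * theta + torque.
        alpha = (mgh * theta + torque) / inertia
        omega += alpha * dt        # semi-implicit Euler step
        theta += omega * dt
        history.append((theta, omega))
        trajectory.append(theta)
    return trajectory


if __name__ == "__main__":
    for delay in (0.0, 0.5):
        sway = simulate_stance(delay)
        print(f"delay={delay:.1f} s  peak |theta|={max(abs(x) for x in sway):.3g} rad")
```

With these assumed gains the undelayed loop settles back toward upright, whereas the same gains with the long delay produce growing sway. Because the model is linearized, large angles in the unstable case are not physically meaningful; they only signal loss of balance.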