Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation

Michelle Pan, Mariah Schrum, Vivek Myers, Erdem Bıyık, Anca Dragan
2024-10-08
Abstract: Adaptive brain stimulation can treat neurological conditions such as Parkinson's disease and post-stroke motor deficits by influencing abnormal neural activity. Because of patient heterogeneity, each patient requires a unique stimulation policy to achieve optimal neural responses. Model-free reinforcement learning (MFRL) holds promise in learning effective policies for a variety of similar control tasks, but is limited in domains like brain stimulation by a need for numerous costly environment interactions. In this work we introduce Coprocessor Actor Critic, a novel, model-based reinforcement learning (MBRL) approach for learning neural coprocessor policies for brain stimulation. Our key insight is that coprocessor policy learning is a combination of learning how to act optimally in the world and learning how to induce optimal actions in the world through stimulation of an injured brain. We show that our approach overcomes the limitations of traditional MFRL methods in terms of sample efficiency and task success and outperforms baseline MBRL approaches in a neurologically realistic model of an injured brain.
Machine Learning, Artificial Intelligence, Human-Computer Interaction
What problem does this paper attempt to address?
This paper develops a model-based reinforcement learning method, Coprocessor Actor Critic (CopAC), for adaptive brain stimulation to treat neurological conditions such as Parkinson's disease and post-stroke motor deficits. It focuses on three aspects:

1. **Improving sample efficiency**: Traditional model-free reinforcement learning (MFRL) methods need a large number of environment interactions to learn an effective stimulation policy. That is impractical in clinical settings, where each patient interaction is costly and may carry side effects. The paper proposes a model-based method to overcome this limitation.
2. **Improving task success rate**: Policy learning is decomposed into two stages. The first learns, in a simulated environment, how to act optimally in the world. The second learns online how to induce those optimal actions by stimulating the injured brain. This decomposition improves both sample efficiency and task success rate.
3. **Establishing physiologically and neurologically realistic models**: To validate the method, the paper constructs a physiologically and neurologically realistic simulation environment for evaluating adaptive brain stimulation policies for stroke. The environment also provides a basis for future research.

In summary, the study introduces CopAC to make adaptive brain stimulation more effective by reducing the number of patient interactions required and increasing the task success rate.
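The two-stage decomposition described above can be illustrated with a small tabular toy. This is a minimal sketch under stated assumptions, not the paper's algorithm: the one-step task, the noisy reward, the `injured_brain` mapping, and every function and constant name here are hypothetical, and the brain is assumed to respond to stimulation independently of the state. The point it shows is only that the expensive patient interactions are spent on modelling the brain, while the value of actions is learned entirely in simulation.

```python
import random

# Hypothetical toy sizes: 4 world states, 3 actions, 3 stimulation patterns.
N_STATES, N_ACTIONS, N_STIMS = 4, 3, 3


def world_reward(state, action, rng):
    """Simulated task (assumption): one action per state is rewarded, with noise."""
    base = 1.0 if action == state % N_ACTIONS else 0.0
    return base + rng.gauss(0.0, 0.1)


def injured_brain(stim):
    """Stand-in patient (assumption): stimulation deterministically selects an
    action through a mapping the learner does not know in advance."""
    return (stim + 1) % N_ACTIONS


def learn_world_critic(samples_per_pair=200, seed=0):
    """Stage 1, simulation only: estimate Q(s, a) by averaging sampled rewards."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for s in range(N_STATES):
        for a in range(N_ACTIONS):
            total = sum(world_reward(s, a, rng) for _ in range(samples_per_pair))
            q[s][a] = total / samples_per_pair
    return q


def learn_stimulation_policy(q):
    """Stage 2, online: probe each stimulation once to model the brain, then
    pick the stimulation whose predicted action the critic values most."""
    brain_model = {}
    for stim in range(N_STIMS):
        brain_model[stim] = injured_brain(stim)  # one costly patient interaction

    def policy(state):
        return max(brain_model, key=lambda u: q[state][brain_model[u]])

    return policy, len(brain_model)


q_table = learn_world_critic()
policy, n_patient_interactions = learn_stimulation_policy(q_table)
successes = sum(injured_brain(policy(s)) == s % N_ACTIONS for s in range(N_STATES))
print(f"patient interactions: {n_patient_interactions}, task success: {successes}/{N_STATES}")
```

In this toy the critic costs 2,400 simulated samples but no patient time, and the brain model needs one probe per stimulation pattern. A real setting would involve continuous states, a stochastic and state-dependent brain response, and function approximation in place of the tables used here.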