A Model-Based Hybrid Soft Actor-Critic Deep Reinforcement Learning Algorithm for Optimal Ventilator Settings
Shaotao Chen, Xihe Qiu, Xiaoyu Tan, Zhijun Fang, Yaochu Jin
DOI: https://doi.org/10.1016/j.ins.2022.08.028
IF: 8.1
2022-01-01
Information Sciences
Abstract: A ventilator is a device that mechanically assists in pumping air into the lungs, providing life-saving supportive therapy in the intensive care unit (ICU). In clinical scenarios, each patient has unique physiological circumstances and specific respiratory diseases, and therefore requires individualized ventilator settings. Precisely adjusting ventilator parameters and modifying them in time requires long-term supervision by experienced clinicians. Moreover, even a small clinical error can result in severe lung injury, induce multi-system organ dysfunction, and increase mortality. To reduce the workload of clinicians and prevent medical errors, machine learning (ML) methods, and more specifically reinforcement learning (RL) methods, have been developed to automatically adjust the ventilator’s parameters and select optimal strategies. However, ventilator settings contain both continuous parameters (e.g., frequency) and discrete parameters (e.g., ventilation mode), which makes such problems challenging for conventional RL-based approaches. In addition, models with high data efficiency are needed to cope with the scarcity of medical data. In this paper, we propose a model-based hybrid soft actor-critic (MHSAC) algorithm built on the classic soft actor-critic (SAC) algorithm and the model-based policy optimization (MBPO) framework. The algorithm learns both continuous and discrete policies from the current and predicted states of the patient’s physiological information with high data efficiency. Results reveal that our proposed model significantly outperforms the baseline models, achieving superior efficiency and high accuracy in the OpenAI Gym simulation environment. Our proposed model is capable of resolving mixed action space problems, enhancing data efficiency, and accelerating convergence, so it can generate practical optimal ventilator settings, minimize possible medical errors, and provide clinical decision support.
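To make the mixed action space concrete, the following is a minimal sketch of how a single policy step might emit one discrete choice (a ventilation mode index) and one continuous value (a frequency). It is not the authors' MHSAC implementation. The function names, the three-mode logits, and the 10 to 30 range are hypothetical values chosen for illustration. The tanh squashing of a Gaussian sample follows the usual SAC convention for bounded continuous actions.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(mode_logits, freq_mean, freq_log_std,
                         freq_low, freq_high, rng):
    """Sample (discrete mode index, continuous frequency) from one policy output.

    All parameter names are illustrative assumptions, not taken from the paper.
    """
    # Discrete part: categorical distribution over ventilation modes.
    probs = softmax(mode_logits)
    mode = rng.choices(range(len(probs)), weights=probs, k=1)[0]

    # Continuous part: Gaussian sample squashed by tanh into [-1, 1].
    raw = rng.gauss(freq_mean, math.exp(freq_log_std))
    squashed = math.tanh(raw)

    # Rescale the squashed value to the allowed frequency range.
    freq = freq_low + 0.5 * (squashed + 1.0) * (freq_high - freq_low)
    return mode, freq


rng = random.Random(0)  # fixed seed so the sketch is reproducible
mode, freq = sample_hybrid_action(
    mode_logits=[0.2, 1.0, -0.5],  # hypothetical scores for three modes
    freq_mean=0.0,
    freq_log_std=-1.0,
    freq_low=10.0,   # hypothetical lower bound
    freq_high=30.0,  # hypothetical upper bound
    rng=rng,
)
print(mode, freq)
```

In a full agent, the logits, mean, and log standard deviation would come from a neural network conditioned on the patient state, and the critic would score the joint (mode, frequency) action. Here they are fixed constants so that the sampling step can be read in isolation.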