Optimal Treatment Strategies for Critical Patients with Deep Reinforcement Learning

Simi Job,Xiaohui Tao,Lin Li,Haoran Xie,Taotao Cai,Jianming Yong,Qing Li
DOI: https://doi.org/10.1145/3643856
IF: 5
2024-02-01
ACM Transactions on Intelligent Systems and Technology
Abstract: Personalized clinical decision support systems are increasingly being adopted due to the emergence of data-driven technologies, with this approach now gaining recognition in critical care. The task of incorporating diverse patient conditions and treatment procedures into critical care decision-making can be challenging due to the heterogeneous nature of medical data. Advances in Artificial Intelligence (AI), particularly Reinforcement Learning (RL) techniques, enable the development of personalized treatment strategies for severe illnesses by using a learning agent to recommend optimal policies. In this study, we propose a Deep Reinforcement Learning (DRL) model with a tailored reward function and an LSTM-GRU-derived state representation to formulate optimal treatment policies for vasopressor administration in stabilizing patient physiological states in critical care settings. Using an ICU dataset and the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we focus on patients with Acute Respiratory Distress Syndrome (ARDS) that has led to Sepsis, to derive optimal policies that can prioritize patient recovery over patient survival. Both the DDQN (RepDRL-DDQN) and Dueling DDQN (RepDRL-DDDQN) versions of the DRL model surpass the baseline performance, with the proposed model’s learning agent achieving an optimal learning process across our performance measuring schemes. The robust state representation served as the foundation for enhancing the model’s performance, ultimately providing an optimal treatment policy focused on rapid patient recovery.
computer science, information systems, artificial intelligence
What problem does this paper attempt to address?
The main objective of this paper is to use Deep Reinforcement Learning (DRL) to formulate optimal treatment strategies for critically ill patients, particularly those with sepsis caused by Acute Respiratory Distress Syndrome (ARDS). Specifically, the research team aims to develop a DRL model that recommends the best vasopressor administration strategy, based on the patient's physiological state, to stabilize their condition.
To achieve this goal, the researchers propose a DRL framework that combines Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRU). The framework first encodes the patient's multidimensional clinical time series data with the LSTM-GRU model to generate compact and informative state representations. These state representations are then fed into a DRL module based on a Double Deep Q-Network (DDQN) or its dueling-architecture version (DDDQN) to learn the optimal treatment policy.
The main contributions of the study include:
1. Proposing a Recurrent Neural Network (RNN)-based deep reinforcement learning architecture with a specially designed reward structure to formulate optimal treatment strategies that can prevent the patient's condition from deteriorating.
2. Developing a model that integrates important aspects of the patient's medical history, thereby improving the performance of the proposed deep reinforcement learning framework.
3. Constructing a framework that can formulate treatment strategies without requiring a large amount of sensitive patient information.
4. Showing experimentally that the proposed model outperforms traditional deep reinforcement learning techniques in recommending personalized treatment strategies, and that it has the potential to be extended to a broader critical care management framework.
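To make the DDQN and dueling components mentioned above more concrete, here is a minimal Python sketch of the two standard ideas, under stated assumptions. It is not the authors' implementation. The function names, the three-action setup, and all numeric values are hypothetical, and the LSTM-GRU encoder and the paper's tailored reward function are not reproduced. The mean-subtraction form of the dueling aggregation is the commonly used variant; whether the paper uses exactly this form is an assumption.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Double DQN target for one transition.

    The online network selects the next action; the target network
    evaluates it. Decoupling selection from evaluation reduces the
    overestimation that a plain max over target Q-values tends to cause.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical example with three discrete dose levels (illustrative values only).
q_online_next = [1.0, 3.0, 2.0]   # online network's Q-values for the next state
q_target_next = [0.5, 1.5, 4.0]   # target network's Q-values for the next state

ddqn_target = double_dqn_target(1.0, False, 0.5, q_online_next, q_target_next)
plain_dqn_target = 1.0 + 0.5 * max(q_target_next)
dueling_q = dueling_q_values(2.0, [1.0, 2.0, 3.0])
```

In this toy case the online network picks action 1, so the Double DQN target bootstraps from the target network's value for that action (1.5) rather than from its maximum (4.0), which is what a plain DQN target would use. In a full pipeline, the Q-value inputs would come from networks that take the encoded patient state as input.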
In summary, this study aims to formulate more personalized treatment strategies for critically ill patients through deep reinforcement learning technology to promote their rapid recovery.