Abstract: Purpose: Current sepsis treatment lacks an effective reference and relies heavily on clinicians' experience. We therefore used reinforcement learning to build an assistive model for sepsis medication treatment. Methods: Using the latest Sepsis 3.0 diagnostic criteria, 19,582 sepsis patients were screened from the Medical Information Mart for Intensive Care III (MIMIC-III) database for treatment strategy research, and forty-six features were used in modeling. The medication strategy studied is the dosage of vasopressors and intravenous infusion. We propose Dueling DDQN to predict a patient's medication strategy (vasopressor and intravenous infusion dosage) from the relationship between the patient's state, the reward function, and the medication action. We also built safeguards against potentially high-risk behaviors of Dueling DDQN, in particular sudden changes in vasopressor dose, which can lead to harmful clinical effects. To strengthen the guidance that clinically effective medication strategies provide to the model, we propose a hybrid model (Safe-Dueling DDQN + expert strategies) to optimize medication strategies. Results: The Dueling DDQN medication model for sepsis patients outperforms clinical strategies and other models in terms of off-policy evaluation values and mortality, reducing mortality from 16.8% under clinical strategies to 13.8%. Compared with Dueling DDQN, the proposed Safe-Dueling DDQN uses fewer actions involving vasopressors overall and reduces large dose fluctuations. The proposed hybrid model can switch between expert strategies and Safe-Dueling DDQN strategies based on the patient's current state. Conclusions: The proposed reinforcement learning model for sepsis medication treatment has practical clinical value and can improve patient survival to a certain extent while keeping medication balanced and safe.
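
The abstract names three mechanisms without spelling them out: the dueling value decomposition, the double DQN target, and a guard against sudden vasopressor dose changes. The Python sketch below illustrates the standard textbook form of the first two and one possible form of the third. It is a minimal sketch, not the authors' implementation: the 5 x 5 action grid, the action encoding, the `max_step` limit, and all function names are assumptions introduced here for illustration, since the abstract specifies none of them.

```python
# Hypothetical sketch. The 5x5 dose grid, the action encoding, and every name
# below are illustrative assumptions, not details taken from the paper.
from typing import List, Sequence

N_VASO = 5  # assumed number of discretized vasopressor dose levels
N_IV = 5    # assumed number of discretized IV fluid dose levels
# Assumed encoding: action = vaso_level * N_IV + iv_level


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def safe_greedy_action(q_values: Sequence[float], prev_vaso_level: int,
                       max_step: int = 1) -> int:
    """Greedy action restricted to vasopressor levels within max_step
    of the previous level (one possible way to damp sudden dose changes)."""
    allowed = [a for a in range(len(q_values))
               if abs(a // N_IV - prev_vaso_level) <= max_step]
    return max(allowed, key=lambda a: q_values[a])
```

With this encoding, a patient previously at vasopressor level 0 can only be moved to level 0 or 1 in the next step, even if the unconstrained greedy action would jump to the highest level. Whether the paper enforces its safeguard by action masking, by reward shaping, or by some other means is not stated in the abstract.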

Optimizing Sequential Medical Treatments with Auto-Encoding Heuristic Search in POMDPs

The Actor Search Tree Critic (ASTC) for Off-Policy POMDP Learning in Medical Decision Making

Optimizing Medical Treatment for Sepsis in Intensive Care: from Reinforcement Learning to Pre-Trial Evaluation

Dynamic Programming for Solving a Simulated Clinical Scenario of Sepsis Resuscitation

Safe and Interpretable Estimation of Optimal Treatment Regimes

Deep Offline Reinforcement Learning for Real-world Treatment Optimization Applications

An optimal learning method for developing personalized treatment regimes

Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach

Personalized Dynamic Treatment Regimes in Continuous Time: A Bayesian Approach for Optimizing Clinical Decisions with Timing

Deep-Treat: Learning Optimal Personalized Treatments From Observational Data Using Neural Networks

Bayesian Sequential Optimal Experimental Design for Nonlinear Models Using Policy Gradient Reinforcement Learning

Optimal Treatment Strategies for Critical Patients with Deep Reinforcement Learning

Pareto-Optimal Estimation and Policy Learning on Short-term and Long-term Treatment Effects

Safe Exploration for Optimization with Gaussian Processes

Experimenting on Markov Decision Processes with Local Treatments

Optimizing Sepsis Treatment Strategies Via a Reinforcement Learning Model

Adaptive Online Packing-guided Search for POMDPs

Optimal discharge of patients from intensive care via a data-driven policy learning framework

Sound Heuristic Search Value Iteration for Undiscounted POMDPs with Reachability Objectives

Learning Optimal Treatment Strategies for Sepsis Using Offline Reinforcement Learning in Continuous Space

Proximal Reinforcement Learning: Efficient Off-Policy Evaluation in Partially Observed Markov Decision Processes