Abstract: Deep Reinforcement Learning is an effective tool for drug dosing for chronic condition management. However, the final protocol is generally a black box without any justification for its prescribed doses. This paper addresses this issue by proposing an explainable dosing protocol for warfarin using a Proximal Policy Optimization method combined with Policy Distillation. We introduce Action Forging as an effective tool to achieve explainability. Our focus is on the maintenance dosing protocol. Results show that the final model is as easy to understand and deploy as the current dosing protocols and outperforms the baseline dosing algorithms.

What problem does this paper attempt to address?

The main problem this paper addresses is improving both the interpretability and the performance of warfarin maintenance dosing protocols. Specifically, the authors propose an interpretable model based on Deep Reinforcement Learning (DRL) that yields a maintenance dosing protocol which is easy to understand and outperforms existing dosing protocols.

### Problem Background

Warfarin is a commonly used anticoagulant, but adjusting its dose is complicated because a patient's diet, lifestyle, and genetic factors can all affect the drug's efficacy. In addition, warfarin's therapeutic range is very narrow: an excessive dose may lead to bleeding, while an insufficient dose may cause thromboembolism. Finding the appropriate dose is therefore crucial for both safety and efficacy.

Most existing warfarin dosing protocols rely on clinical trial data and supervised learning methods, such as non-linear regression models. Although these methods perform well in predicting the initial dose, they have limitations when adjusting the maintenance dose. DRL models can outperform these traditional methods, but they are usually "black boxes" whose decision-making process is difficult to explain, which is unacceptable in the medical field.

### Core Contributions of the Paper

To overcome these problems, the paper proposes an interpretable deep reinforcement learning model that combines Proximal Policy Optimization (PPO) with Policy Distillation. The model improves interpretability through two techniques:

1. **Action Forging**: The action space is pre-processed so that the final policy is easier to interpret. For example, reducing the frequency and the number of dose-change options makes the model more convenient for doctors to understand and use.
2. **Policy Distillation**: The trained DRL model is converted into a decision tree, producing a form similar to the existing dosing tables used in clinical practice.

### Experimental Results

The experimental results show that the proposed model outperforms the existing dosing protocols and that its output is more intuitive and easier to understand. This matters to clinicians because it can improve their trust in and acceptance of the model while preserving the treatment effect.

In short, the goal of the paper is to develop a system that can both manage warfarin doses effectively and clearly explain its decisions, in order to improve medical safety and the level of personalized treatment.
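The two techniques summarized above can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the teacher policy is a hand-written stand-in for a trained PPO agent, the state is a single INR value, the allowed dose changes in `ALLOWED_CHANGES` are hypothetical, and the greedy tree fitter is a simple substitute for whatever distillation procedure the authors actually use.

```python
from collections import Counter

# Hypothetical forged action set: allowed percent dose changes (not the paper's values).
ALLOWED_CHANGES = [-20, -10, 0, 10, 20]

def teacher_policy(inr):
    """Stand-in for a trained DRL policy (assumption): a continuous percent
    dose change that pushes INR toward 2.5, clipped to +/-25."""
    return max(-25.0, min(25.0, (2.5 - inr) * 20.0))

def forge_action(change):
    """Action forging, sketched as snapping to the nearest allowed option."""
    return min(ALLOWED_CHANGES, key=lambda a: abs(a - change))

def majority(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]

def errors_of(part):
    """Number of samples a single majority-vote leaf would get wrong."""
    labels = [a for _, a in part]
    m = majority(labels)
    return sum(a != m for a in labels)

def fit_tree(samples, depth):
    """Greedy 1-D decision tree on (inr, action) pairs.
    Returns a leaf action or a tuple (threshold, left_subtree, right_subtree)."""
    labels = [a for _, a in samples]
    if depth == 0 or len(set(labels)) == 1:
        return majority(labels)
    best = None
    xs = sorted({x for x, _ in samples})
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [s for s in samples if s[0] <= thr]
        right = [s for s in samples if s[0] > thr]
        err = errors_of(left) + errors_of(right)
        if best is None or err < best[0]:
            best = (err, thr, left, right)
    if best is None:  # only one distinct INR value, cannot split
        return majority(labels)
    _, thr, left, right = best
    return (thr, fit_tree(left, depth - 1), fit_tree(right, depth - 1))

def predict(tree, inr):
    """Walk the tree until a leaf (an allowed dose change) is reached."""
    while isinstance(tree, tuple):
        thr, left, right = tree
        tree = left if inr <= thr else right
    return tree

def to_table(tree, lo=float("-inf"), hi=float("inf")):
    """Flatten the tree into (inr_low, inr_high, dose_change) rows."""
    if not isinstance(tree, tuple):
        return [(lo, hi, tree)]
    thr, left, right = tree
    return to_table(left, lo, thr) + to_table(right, thr, hi)

# Distillation: label a grid of states with the (forged) teacher action,
# then fit a shallow tree to imitate it.
samples = [(1.0 + i / 10, forge_action(teacher_policy(1.0 + i / 10)))
           for i in range(31)]
tree = fit_tree(samples, depth=4)
fidelity = sum(predict(tree, x) == a for x, a in samples) / len(samples)

for low, high, change in to_table(tree):
    print(f"INR in ({low:.2f}, {high:.2f}]: change dose by {change:+d}%")
print(f"fidelity to teacher: {fidelity:.2f}")
```

Printing `to_table(tree)` gives one row per INR interval, which is the kind of dosing-table form the summary describes. In practice the student tree would be fit on states visited by the trained agent rather than on a uniform grid, and a library implementation such as a standard decision tree classifier would replace the hand-rolled fitter.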

An Explainable Deep Reinforcement Learning Model for Warfarin Maintenance Dosing Using Policy Distillation and Action Forging

Optimizing Warfarin Dosing Using Contextual Bandit: An Offline Policy Learning and Evaluation Method

Estimation of Warfarin Dosage with Reinforcement Learning

Optimizing warfarin dosing for patients with atrial fibrillation using machine learning

Model Based Reinforcement Learning for Personalized Heparin Dosing

Optimizing the Dynamic Treatment Regime of In-Hospital Warfarin Anticoagulation in Patients after Surgical Valve Replacement Using Reinforcement Learning

On the Challenges of using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects

Development and Validation of Heparin Dosing Policies Using an Offline Reinforcement Learning Algorithm

Model-Informed Reinforcement Learning for Enabling Precision Dosing Via Adaptive Dosing

Online Learning to Estimate Warfarin Dose with Contextual Linear Bandits

Controlling Level of Unconsciousness by Titrating Propofol with Deep Reinforcement Learning

An ensemble learning based framework to estimate warfarin maintenance dose with cross-over variables exploration on incomplete data set

OMG-RL:Offline Model-based Guided Reward Learning for Heparin Treatment

The application of a perceptron model to classify an individual's response to a proposed loading dose regimen of Warfarin

Development of A Novel Individualized Warfarin Dose Algorithm Based on A Population Pharmacokinetic Model with Improved Prediction Accuracy for Chinese Patients after Heart Valve Replacement

Building and analyzing machine learning-based warfarin dose prediction models using scikit-learn

Warfarin- A Natural Anticoagulant: A Review of Research Trends for Precision Medication

Deep learning identifies explainable reasoning paths of mechanism of action for drug repurposing from multilayer biological network

A Computer-Aided System for Determining the Application Range of a Warfarin Clinical Dosing Algorithm Using Support Vector Machines with a Polynomial Kernel Function

Estimate the Warfarin Dose by Ensemble of Machine Learning Algorithms

Warfarin dose estimation on multiple datasets with automated hyperparameter optimisation and a novel software framework