Abstract: In this work, we examine recently developed methods for Bayesian inference of optimal dynamic treatment regimes (DTRs). A DTR is a set of treatment decision rules that tailors patient care to patient-specific characteristics, and DTRs therefore fall within the realm of precision medicine. Because researchers in this field tailor therapy with the aim of improving health outcomes, they are most interested in identifying optimal DTRs. Recent work has developed Bayesian methods for identifying optimal DTRs in a family indexed by ψ via Bayesian dynamic marginal structural models (MSMs) (Rodriguez Duque D, Stephens DA, Moodie EEM, Klein MB. Semiparametric Bayesian inference for dynamic treatment regimes via dynamic regime marginal structural models. Biostatistics; 2022. (In Press)); we review the proposed estimation procedure and illustrate its use via the new BayesDTR R package. Although the methods in Rodriguez Duque et al. (Biostatistics; 2022, in press) can estimate optimal DTRs well, they may yield biased estimators when a grid search for the optimum is employed or when the model for the value function, that is, the expected outcome if everyone in a population were to follow a given treatment strategy, is misspecified. We describe recent work that uses a Gaussian process (GP) prior on the value function as a means to robustly identify optimal DTRs (Rodriguez Duque D, Stephens DA, Moodie EEM. Estimation of optimal dynamic treatment regimes using Gaussian processes; 2022. Available from: https://doi.org/10.48550/arXiv.2105.12259). We demonstrate how a GP approach may be implemented with the BayesDTR package and contrast it with other value-search approaches to identifying optimal DTRs.
We use data from an HIV therapeutic trial to illustrate a standard analysis with these methods, using both the original observed trial data and an additional simulated component to showcase a longitudinal (two-stage DTR) analysis.

A Penalized Shared-parameter Algorithm for Estimating Optimal Dynamic Treatment Regimens

Penalized Q-Learning for Dynamic Treatment Regimes

Learning Optimal Dynamic Treatment Regimens Subject to Stagewise Risk Controls

Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes

Reinforcement Learning in Clinical Medicine: A Method to Optimize Dynamic Treatment Regime over Time

Estimating Optimal Infinite Horizon Dynamic Treatment Regimes via pT-Learning

Stage-Aware Learning for Dynamic Treatments

Dynamic Treatment Regimes Using Bayesian Additive Regression Trees for Censored Outcomes

Robust Hybrid Learning for Estimating Personalized Dynamic Treatment Regimens

High-Dimensional A-Learning for Optimal Dynamic Treatment Regimes

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

Set-valued dynamic treatment regimes for competing outcomes

A Smoothed Q‐learning Algorithm for Estimating Optimal Dynamic Treatment Regimes

Personalized Dynamic Treatment Regimes in Continuous Time: A Bayesian Approach for Optimizing Clinical Decisions with Timing

Stabilized Direct Learning for Efficient Estimation of Individualized Treatment Rules

DTR-Bench: An in silico Environment and Benchmark Platform for Reinforcement Learning Based Dynamic Treatment Regime

Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach

Identifying optimally cost-effective dynamic treatment regimes with a Q-learning approach

Constructing Stabilized Dynamic Treatment Regimes for Censored Data

Bayesian inference for optimal dynamic treatment regimes in practice