Constrained Markov decision processes for response-adaptive procedures in clinical trials with binary outcomes

Stef Baas, Aleida Braaksma, Richard J. Boucherie
2024-01-29
Abstract: A constrained Markov decision process (CMDP) approach is developed for response-adaptive procedures in clinical trials with binary outcomes. The resulting CMDP class of Bayesian response-adaptive procedures can be used to target a certain objective, e.g., patient benefit or power, while using constraints to keep other operating characteristics under control. In the CMDP approach, the constraints can be formulated under different priors, which can induce a certain behaviour of the policy under a given statistical hypothesis, or given that the parameters lie in a specific part of the parameter space. A solution method is developed to find the optimal policy, as well as a more efficient method, based on backward recursion, which often yields a near-optimal solution with an available optimality gap. Three applications are considered, involving type I error and power constraints, constraints on the mean squared error, and a constraint on prior robustness. While the CMDP approach slightly outperforms the constrained randomized dynamic programming (CRDP) procedure known from the literature when focussing on type I and II error and mean squared error, showing the general quality of CRDP, CMDP significantly outperforms CRDP when the focus is on type I and II error only.
Methodology, Optimization and Control
What problem does this paper attempt to address?
This paper addresses how to design response-adaptive procedures for clinical trials with binary outcomes using constrained Markov decision processes (CMDPs). The procedures optimize an objective such as patient benefit or power while using constraints to keep other operating characteristics, such as type I error, under control. Constraints can be formulated under different priors, which induces a desired behaviour of the policy under a given statistical hypothesis or when the parameters lie in a specific region of the parameter space. The paper develops a solution method for finding the optimal policy, together with a more efficient method based on backward recursion that often yields a near-optimal solution with an available optimality gap. Three applications are considered: type I error and power constraints, constraints on the mean squared error, and a constraint on prior robustness. CMDP slightly outperforms the constrained randomized dynamic programming (CRDP) procedure known from the literature when the focus is on type I and II error and mean squared error, and significantly outperforms it when the focus is on type I and II error only.
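To make the idea of backward recursion concrete, the following is a minimal sketch for a simplified, unconstrained problem: a two-armed trial with binary outcomes in which each patient is allocated to maximize the expected number of successes. It is not the paper's CMDP solution method and it omits all constraints. The independent Beta(1, 1) priors, the function name `optimal_expected_successes`, and the state layout are assumptions made purely for illustration.

```python
def optimal_expected_successes(n_patients: int) -> float:
    """Expected total successes under the Bayes-optimal allocation policy.

    Illustrative assumptions (not taken from the paper): two arms, independent
    Beta(1, 1) priors, no constraints, and the objective is the expected
    number of successes (patient benefit).
    """
    # value[(s_a, f_a, s_b, f_b)] = expected future successes from that state,
    # where s_x / f_x count the successes / failures observed so far on arm x.
    value = {}

    # Backward recursion: start at the end of the trial (t = n_patients)
    # and step back to the first patient (t = 0).
    for t in range(n_patients, -1, -1):
        for s_a in range(t + 1):
            for f_a in range(t - s_a + 1):
                for s_b in range(t - s_a - f_a + 1):
                    f_b = t - s_a - f_a - s_b
                    state = (s_a, f_a, s_b, f_b)

                    if t == n_patients:
                        value[state] = 0.0  # no patients left to treat
                        continue

                    # Posterior mean success probabilities under Beta(1, 1) priors.
                    p_a = (s_a + 1) / (s_a + f_a + 2)
                    p_b = (s_b + 1) / (s_b + f_b + 2)

                    # Expected value of allocating the next patient to each arm.
                    q_a = (p_a * (1.0 + value[(s_a + 1, f_a, s_b, f_b)])
                           + (1.0 - p_a) * value[(s_a, f_a + 1, s_b, f_b)])
                    q_b = (p_b * (1.0 + value[(s_a, f_a, s_b + 1, f_b)])
                           + (1.0 - p_b) * value[(s_a, f_a, s_b, f_b + 1)])

                    value[state] = max(q_a, q_b)

    return value[(0, 0, 0, 0)]
```

Calling `optimal_expected_successes(n)` returns the optimal value at the start of a trial with `n` patients. The number of states grows on the order of `n**4`, so this sketch is only practical for small trial sizes.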