Patient stratification in multi-arm trials: a two-stage procedure with Bayesian profile regression

Yuejia Xu, Angela M. Wood, Brian D.M. Tom
DOI: https://doi.org/10.48550/arXiv.2302.11647
2023-02-23
Abstract: Precision medicine is an emerging field that takes into account individual heterogeneity to inform better clinical practice. In clinical trials, the evaluation of treatment effect heterogeneity is an important component, and recently, many statistical methods have been proposed for stratifying patients into different subgroups based on such heterogeneity. However, the majority of existing methods developed for this purpose focus on the case with a dichotomous treatment and are not directly applicable to multi-arm trials. In this paper, we consider the problem of patient stratification in multi-arm trial settings and propose a two-stage procedure within the Bayesian nonparametric framework. Specifically, we first use Bayesian additive regression trees (BART) to predict potential outcomes (treatment responses) under different treatment options for each patient, and then we leverage Bayesian profile regression to cluster patients into subgroups according to their baseline characteristics and predicted potential outcomes. We further embed a variable selection procedure into our proposed framework to identify the patient characteristics that actively "drive" the clustering structure. We conduct simulation studies to examine the performance of our proposed method and demonstrate the method by applying it to a UK-based multi-arm blood donation trial, wherein our method uncovers five clinically meaningful donor subgroups.
Methodology, Applications
What problem does this paper attempt to address?
This paper addresses patient stratification in multi-arm trials. The authors propose a two-stage procedure within the Bayesian nonparametric framework that stratifies patients according to their baseline characteristics and predicted treatment responses. The aim is to overcome a limitation of existing methods, which mainly handle a dichotomous treatment and are not directly applicable to multi-arm trials.

### Background and Objectives of the Paper

With the development of precision medicine, accounting for individual heterogeneity has become increasingly important for guiding clinical practice. In clinical trials, evaluating treatment effect heterogeneity is a key component. However, most existing statistical methods assume a dichotomous treatment and cannot be applied directly to trials with more than two arms. The objective of this paper is therefore to solve the patient stratification problem in the multi-arm setting by proposing a new method for identifying patient subgroups.

### Method Overview

1. **Stage 1: Predicting Potential Outcomes Using Bayesian Additive Regression Trees (BART)**
   - Use the BART model to predict each patient's potential outcomes (treatment responses) under the different treatment options.
   - This stage yields, for each patient, a predicted response under every arm, from which individual treatment effects can be estimated.
2. **Stage 2: Clustering Using Bayesian Profile Regression**
   - Use Bayesian profile regression to cluster patients according to their baseline characteristics and predicted potential outcomes.
   - The profile regression model is implemented through a Dirichlet process mixture model (DPMM), which allows the number of clusters to be inferred directly from the data.
   - The model also incorporates variable selection to identify the patient characteristics that "drive" the clustering structure.

### Main Contributions

1. **Clustering with Outcome Variables**: Unlike traditional unsupervised clustering methods, this method allows the outcome variable to influence cluster membership, so that the resulting clusters relate to the outcomes of clinical interest and are clinically meaningful.
2. **Automatically Inferring the Number of Clusters**: With a Dirichlet process prior, the number of clusters is inferred directly from the data, which avoids having to fix the number of clusters in advance as traditional clustering methods require.
3. **Handling Correlated Features**: The method can handle correlated patient features without degrading the performance or interpretability of the model.
4. **Quantifying Uncertainty**: The method accounts for uncertainty in both the number of clusters and the cluster assignments, and it can quantify the uncertainty associated with the "representative" clustering through model averaging.

### Application Examples

The paper evaluates the method through simulation studies and demonstrates it on a real application, the INTERVAL trial, a UK-based multi-arm blood donation trial. In the INTERVAL trial, the method identified five clinically meaningful donor subgroups. These subgroups could help make future donor recruitment more effective and blood collection more efficient.

In conclusion, this paper proposes a two-stage Bayesian nonparametric method for patient stratification in multi-arm trials, with both methodological and practical value.
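
As a rough illustration of how a Dirichlet process prior leaves the number of clusters open rather than fixed, the sketch below samples partitions from the Chinese restaurant process, the distribution over partitions implied by a Dirichlet process. It is not the paper's sampler: it draws from the prior only and ignores covariates and outcomes. The function name `sample_crp_partition`, the concentration values, and the sample size of 200 are illustrative assumptions, not taken from the paper.

```python
import random


def sample_crp_partition(n_patients, alpha, rng):
    """Draw one partition of n_patients from a Chinese restaurant process.

    alpha > 0 is the concentration parameter (hypothetical values are used below).
    Returns (labels, sizes): labels[i] is the cluster of patient i and
    sizes[k] is the number of patients in cluster k.
    """
    labels = []
    sizes = []
    for _ in range(n_patients):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if k == len(sizes):
            sizes.append(1)  # open a new cluster
        else:
            sizes[k] += 1  # join an existing cluster
        labels.append(k)
    return labels, sizes


def mean_cluster_count(n_patients, alpha, n_draws, rng):
    """Average number of clusters over repeated prior draws."""
    total = 0
    for _ in range(n_draws):
        _, sizes = sample_crp_partition(n_patients, alpha, rng)
        total += len(sizes)
    return total / n_draws


rng = random.Random(0)
labels, sizes = sample_crp_partition(200, alpha=1.0, rng=rng)
print("clusters in one draw:", len(sizes))

# The number of clusters is random under the prior and tends to grow with alpha.
mean_small = mean_cluster_count(200, alpha=0.5, n_draws=200, rng=rng)
mean_large = mean_cluster_count(200, alpha=5.0, n_draws=200, rng=rng)
print("mean clusters, alpha=0.5:", mean_small)
print("mean clusters, alpha=5.0:", mean_large)
```

In a full DPMM, these prior partition probabilities are combined with the likelihood of each patient's data under each cluster, so the posterior number of clusters reflects the data rather than a preset value.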