Leveraging historical data to optimize the number of covariates and their explained variance in the analysis of randomized clinical trials.

Samuel Branders, Alvaro Pereira, Guillaume Bernard, Marie Ernst, Jamie Dananberg, Adelin Albert
DOI: https://doi.org/10.1177/09622802211065246
IF: 2.494
2021-12-13
Statistical Methods in Medical Research
Abstract: The amount of data collected from patients involved in clinical trials is continuously growing. All baseline patient characteristics are potential covariates that could be used to improve clinical trial analysis and power. However, the limited number of patients in phase I and II studies restricts the possible number of covariates included in the analyses. In this paper, we investigate the cost/benefit ratio of including covariates in the analysis of clinical trials with a continuous outcome. Within this context, we address the long-running question “What is the optimum number of covariates to include in a clinical trial?” To further improve the benefit/cost ratio of covariates, historical data can be leveraged to pre-specify the covariate weights, which can be viewed as the definition of a new composite covariate. Here we analyze the use of a composite covariate to improve the estimated treatment effect in small clinical trials. A composite covariate limits the loss of degrees of freedom and the risk of overfitting.
health care sciences & services, medical informatics, mathematical & computational biology, statistics & probability
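
The composite-covariate idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the data are simulated, the sample sizes (500 historical patients, 40 trial patients, 10 covariates) are arbitrary, and the weights are obtained by ordinary least squares, which may differ from the authors' procedure. The point is only to show that fixing the weights on historical data turns many covariates into one, so the trial analysis spends a single degree of freedom on covariate adjustment.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 10                                 # number of baseline covariates (assumed)
beta = rng.normal(scale=0.5, size=p)   # hypothetical true covariate effects


def simulate(n, effect):
    """Simulate n patients: covariates, 0/1 treatment arm, continuous outcome."""
    X = rng.normal(size=(n, p))
    treat = rng.integers(0, 2, size=n)
    y = X @ beta + effect * treat + rng.normal(size=n)
    return X, treat, y


def ols(design, y):
    """Least-squares fit; returns coefficients and residual degrees of freedom."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, len(y) - design.shape[1]


# 1. Historical data (assumed to carry no treatment effect): estimate the
#    covariate weights once, before the trial is analyzed.
X_hist, _, y_hist = simulate(500, effect=0.0)
hist_coef, _ = ols(np.column_stack([np.ones(500), X_hist]), y_hist)
weights = hist_coef[1:]                # drop the intercept

# 2. Small trial: collapse the p covariates into one pre-specified composite.
n_trial = 40
X_trial, treat, y_trial = simulate(n_trial, effect=1.0)
composite = X_trial @ weights

# 3. Adjusted analyses: intercept + treatment + covariate(s).
ones = np.ones(n_trial)
coef_comp, df_comp = ols(np.column_stack([ones, treat, composite]), y_trial)
coef_full, df_full = ols(np.column_stack([ones, treat, X_trial]), y_trial)

print("composite covariate: effect estimate", coef_comp[1], "residual df", df_comp)
print("all covariates:      effect estimate", coef_full[1], "residual df", df_full)
```

In this sketch the composite model estimates 3 parameters from 40 patients, while the model with all 10 covariates estimates 12, which is the loss of degrees of freedom the abstract refers to.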