Computational Phenomapping of Randomized Clinical Trials to Enable Assessment of their Real-world Representativeness and Personalized Inference

Phyllis M Thangaraj, Evangelos K Oikonomou, Lovedeep Singh Dhingra, Arya Aminorroaya, Rahul Jayaram, Marc A Suchard, Rohan Khera
DOI: https://doi.org/10.1101/2024.05.15.24306285
2024-05-16
Abstract:
Importance: Randomized clinical trials (RCTs) are the standard for defining an evidence-based approach to managing disease, but their generalizability to real-world patients remains challenging to quantify.
Objective: To develop a multidimensional patient variable mapping algorithm that quantifies how similar electronic health record (EHR) patients are to the participants of a corresponding RCT and how well they are represented by it, and to estimate putative treatment effects in real-world settings based on individual treatment effects observed in the RCT.
Design: A retrospective analysis of the Treatment of Preserved Cardiac Function Heart Failure with an Aldosterone Antagonist Trial (TOPCAT; 2006-2012) and a multi-hospital EHR patient cohort from the Yale New Haven Hospital System (YNHHS; 2015-2023).
Setting: A multicenter international RCT (TOPCAT) and a multi-hospital patient cohort (YNHHS).
Participants: All TOPCAT participants, and patients with heart failure with preserved ejection fraction (HFpEF) and ≥1 hospitalization within YNHHS.
Exposures: Sixty-three pre-randomization characteristics measured across the TOPCAT and YNHHS cohorts.
Main Outcomes and Measures: Real-world generalizability of TOPCAT, assessed with a multidimensional phenotypic distance metric between the TOPCAT and YNHHS cohorts, and the estimated individualized treatment effect of spironolactone use on all-cause mortality within the YNHHS cohort, based on phenotypic distance from the TOPCAT cohort.
Results: There were 3,445 patients in TOPCAT and 11,712 HFpEF patients across five hospital sites. Across the 63 TOPCAT variables mapped by clinicians to the EHR, differences between TOPCAT and each of the 5 EHR sites (median standardized mean difference [SMD] 0.200, IQR 0.037-0.410) were larger than differences among the 5 EHR sites (median SMD 0.062, IQR 0.010-0.130). The synthesis of these differences across covariates using our multidimensional similarity score also suggested substantial phenotypic dissimilarity between the TOPCAT and EHR cohorts. By phenotypic distance, a majority (55%) of TOPCAT participants were closer to each other than to any individual EHR patient. Using a TOPCAT-derived model of individualized treatment benefit from spironolactone, EHR patients who were predicted to benefit and received spironolactone had substantially better outcomes than those who were predicted to benefit but did not receive the medication (HR 0.74, 95% CI 0.62-0.89).
Conclusions and Relevance: We propose a novel approach to evaluating the real-world representativeness of RCT participants against corresponding patients in the EHR across the full multidimensional spectrum of the represented phenotypes. This enables the evaluation of the implications of RCTs for real-world patients.
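The per-covariate comparison reported in the Results (a median SMD across mapped variables) can be illustrated with a minimal sketch. The abstract does not state the exact SMD formula used or how binary variables were handled, so the version below assumes the common definition for continuous covariates: the absolute difference in means divided by the square root of the average of the two sample variances. The function names, covariate names, and toy values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, median, variance


def standardized_mean_difference(a, b):
    """Absolute SMD for one continuous covariate.

    Assumed definition: |mean(a) - mean(b)| / sqrt((var(a) + var(b)) / 2),
    using sample variances. This is a common convention, not necessarily
    the one used in the paper.
    """
    diff = abs(mean(a) - mean(b))
    pooled_sd = sqrt((variance(a) + variance(b)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: identical means give 0, otherwise undefined.
        return 0.0 if diff == 0 else float("inf")
    return diff / pooled_sd


def median_smd(cohort_a, cohort_b):
    """Median SMD over the covariates present in both cohorts.

    Each cohort is a dict mapping a covariate name to a list of values.
    """
    shared = [name for name in cohort_a if name in cohort_b]
    return median(
        standardized_mean_difference(cohort_a[name], cohort_b[name])
        for name in shared
    )


# Toy data only; these are not TOPCAT or YNHHS values.
trial = {"age": [60, 70, 80], "systolic_bp": [120, 130, 140]}
ehr = {"age": [70, 80, 90], "systolic_bp": [120, 130, 140]}

print(median_smd(trial, ehr))
```

In this toy example the two cohorts differ by one standard deviation in age and not at all in systolic blood pressure. Summarizing by the median (with an IQR, as the abstract does) keeps a few strongly imbalanced covariates from dominating the comparison.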