Multicenter Development and Prospective Validation of eCARTv5: A Gradient Boosted Machine Learning Early Warning Score

Matthew M. Churpek, Kyle A. Carey, Ashley Snyder, Christopher J. Winslow, Emily R. Gilbert, Nirav S. Shah, Brian W. Patterson, Majid Afshar, Alan Weiss, Devendra N. Amin, Deborah J. Rhodes, Dana P. Edelson
DOI: https://doi.org/10.1101/2024.03.18.24304462
2024-10-03
Abstract:
OBJECTIVE: Early detection of clinical deterioration using machine learning early warning scores may improve outcomes. However, most implemented scores were developed using logistic regression, only underwent retrospective validation, and were not tested in important subgroups. Our objective was to develop and prospectively validate a gradient boosted machine model (eCARTv5) for identifying clinical deterioration on the wards.
DESIGN: Multicenter retrospective and prospective observational study.
SETTING: Inpatient admissions to the medical-surgical wards at seven hospitals in three health systems for model development (2006-2022) and at 21 hospitals from three health systems for retrospective (2009-2023) and prospective (2023-2024) external validation.
PATIENTS: All adult patients hospitalized at each participating health system during the study years.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Predictor variables (demographics, vital signs, documentation, and laboratory values) were used in a gradient boosted trees algorithm to predict intensive care unit transfer or death in the next 24 hours. The developed model (eCART) was compared to the Modified Early Warning Score (MEWS) and the National Early Warning Score (NEWS) using the area under the receiver operating characteristic curve (AUROC). The development cohort included 901,491 admissions, the retrospective validation cohort included 1,769,461 admissions, and the prospective validation cohort included 205,946 admissions. In retrospective validation, eCART had the highest AUROC (0.835; 95% CI 0.834, 0.835), followed by NEWS (0.766; 95% CI 0.766, 0.767) and MEWS (0.704; 95% CI 0.703, 0.704). eCART's performance remained high (AUROC ≥0.80) across a range of patient demographics, clinical conditions, and during prospective validation.
CONCLUSIONS: We developed eCART, which performed better than the NEWS and MEWS retrospectively, prospectively, and across a range of subgroups. These results served as the foundation for Food and Drug Administration clearance for its use in identifying deterioration in hospitalized ward patients.
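The abstract compares the scores by AUROC, which can be read as the probability that a randomly chosen patient who deteriorated received a higher score than a randomly chosen patient who did not, with ties counted as one half. The following is a minimal sketch of that calculation in plain Python, not the authors' analysis code. The function name `auroc` and the toy outcomes and scores are hypothetical and chosen only for illustration; they are not data from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the AUROC.

    labels: 1 if the outcome occurred, 0 otherwise.
    scores: risk score for each observation (higher = higher risk).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tied scores count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = ICU transfer or death within 24 hours.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
# A continuous risk score (as a model like eCART would produce) ...
score_a = [0.10, 0.20, 0.80, 0.30, 0.70, 0.90, 0.40, 0.35]
# ... and a coarse integer score (as a points-based score would produce).
score_b = [1, 3, 2, 1, 4, 2, 2, 3]

print(auroc(outcomes, score_a))
print(auroc(outcomes, score_b))
```

The double loop is quadratic in the number of observations, so it is suitable only for small examples; at the cohort sizes reported above, a rank-based implementation would be needed. Half credit for ties matters when comparing against integer-valued scores such as MEWS and NEWS, where many observations share the same value.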
Intensive Care and Critical Care Medicine