A Common Longitudinal Intensive Care Unit data Format (CLIF) to enable multi-institutional federated critical illness research
Juan C Rojas, Patrick G Lyons, Kaveri Chhikara, Vaishvik Chaudhari, Sivasubramanium Bhavani, Muna Amir Nour, Kevin G Buell, Kevin D Smith, Catherine A Gao, Saki Amagai, Chengsheng Mao, Yuan Luo, Anna K Barker, Mark Nuppnau, Haley Beck, Rachel Baccile, Michael Hermsen, Zewei Liao, Brenna Park-Egan, Kyle A Carey, Xuan Han, Chad H Hochberg, Nicholas E Ingraham, William Fiske Parker
DOI: https://doi.org/10.1101/2024.09.04.24313058
2024-09-04
Abstract: Background: Critical illness, or acute organ failure requiring life support, threatens over five million American lives annually. Electronic health record (EHR) data are a source of granular information that could generate crucial insights into the nature and optimal treatment of critical illness. However, data management, security, and standardization are barriers to large-scale critical illness EHR studies.
Methods: A consortium of critical care physicians and data scientists from eight US healthcare systems developed the Common Longitudinal Intensive Care Unit (ICU) data Format (CLIF), an open-source database format that harmonizes a minimum set of ICU Data Elements for use in critical illness research. We created a pipeline to process adult ICU EHR data at each site. After development and iteration, we conducted two proof-of-concept studies with a federated research architecture: 1) an external validation of an in-hospital mortality prediction model for critically ill patients and 2) an assessment of 72-hour temperature trajectories and their association with mechanical ventilation and in-hospital mortality using group-based trajectory models.
Results: We converted longitudinal data from 94,356 critically ill patients treated in 2020-2021 (mean age 60.6 years [standard deviation 17.2], 30% Black, 7% Hispanic, 45% female) across 8 health systems and 33 hospitals into the CLIF format. The in-hospital mortality prediction model performed well in the health system where it was derived (0.81 AUC, 0.06 Brier score). Performance varied across CLIF consortium sites (AUCs: 0.74-0.83, Brier scores: 0.06-0.01) and showed some degradation in predictive capability. Temperature trajectories were similar across health systems. Hypothermic and hyperthermic-slow-resolver patients consistently had the highest mortality.
Conclusions: CLIF facilitates efficient, rigorous, and reproducible critical care research. Our federated case studies showcase the potential of CLIF for disease sub-phenotyping and clinical decision-support tool evaluation. Future applications include pragmatic EHR-based trials, target trial emulations, foundational multi-modal AI models of critical illness, and real-time critical care quality dashboards.
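Illustrative sketch: The two metrics reported in the Results, AUC and Brier score, can be computed locally at each site so that only aggregate summaries, not patient-level rows, leave the institution. The following is a minimal sketch of that idea under stated assumptions, not the consortium's actual pipeline: the function names (`brier_score`, `auc`, `site_summary`) and the toy labels and probabilities are hypothetical and are not drawn from the paper or from the CLIF codebase.

```python
from typing import Dict, Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and observed outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and the same length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is ranked above a
    random negative case, counting ties as one half (O(n^2) for clarity)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def site_summary(y_true: Sequence[int], y_prob: Sequence[float]) -> Dict[str, float]:
    """Aggregate metrics a site could share; no patient-level data is returned."""
    return {
        "n": len(y_true),
        "auc": auc(y_true, y_prob),
        "brier": brier_score(y_true, y_prob),
    }


# Hypothetical toy data standing in for one site's outcomes (1 = died in
# hospital) and the model's predicted mortality probabilities.
toy_outcomes = [0, 0, 1, 1]
toy_predictions = [0.1, 0.4, 0.35, 0.8]
toy_site_result = site_summary(toy_outcomes, toy_predictions)
```

In a federated setting, each site would run the same function against its own CLIF tables and return only the resulting dictionary to the coordinating center. A lower Brier score indicates better-calibrated probabilities, and a higher AUC indicates better discrimination.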