GRU-D-Weibull: A novel real-time individualized endpoint prediction

Xiaoyang Ruan, Liwei Wang, Charat Thongprayoon, Wisit Cheungpasitporn, Hongfang Liu
DOI: https://doi.org/10.1016/j.artmed.2023.102696
IF: 7.011
2023-11-10
Artificial Intelligence in Medicine
Abstract:
Background: In the era of healthcare digital transformation, using electronic health record (EHR) data to generate various endpoint estimates for active monitoring is highly desirable in chronic disease management. However, traditional predictive modeling strategies that rely on well-curated data sets can have limited real-world implementation potential because of various data quality issues in EHR data.
Methods: We propose a novel predictive modeling approach, GRU-D-Weibull, which models the Weibull distribution using gated recurrent units with decay (GRU-D), for real-time individualized endpoint prediction and population-level risk management using EHR data.
Experiments: We systematically evaluated the performance and showcased the real-world implementability of the proposed approach through individual-level endpoint prediction using a cohort of patients with chronic kidney disease stage 4 (CKD4). A total of 536 features, including ICD/CPT codes, medications, lab tests, vital measurements, and demographics, were retrieved for 6879 CKD4 patients. Performance metrics, including C-index, L1-loss, Parkes' error, and predicted survival probability at time of event, were compared between GRU-D-Weibull and alternative approaches including the accelerated failure time model (AFT), XGBoost-based AFT (XGB(AFT)), random survival forest (RSF), and Nnet-survival. Both in-process and post-process calibrations were tested on the survival probabilities generated by GRU-D-Weibull.
Results: GRU-D-Weibull demonstrated a C-index of ~0.7 at index date, which increased to ~0.77 at 4.3 years of follow-up, comparable to that of RSF. GRU-D-Weibull achieved an absolute L1-loss of ~1.1 years (SD ≈ 0.95) at CKD4 index date and a minimum of ~0.45 years (SD ≈ 0.3) at 4 years of follow-up, compared with ~1.4 years (SD ≈ 1.1) at index date and ~0.64 years (SD ≈ 0.26) at 4 years for the second-ranked RSF. Both significantly outperformed the other competing approaches. GRU-D-Weibull constrained the predicted survival probability at time of event to a smaller and more fixed range than competing models throughout follow-up. Significant correlations were observed between prediction error and the missing proportions of all major categories of input features at index date (Corr ~0.1 to ~0.3); these faded away within 1 year after index date as more data became available. Through post-training recalibration, we achieved a close alignment between the predicted and observed survival probabilities across multiple prediction horizons at different time points during follow-up.
Conclusion: GRU-D-Weibull shows advantages over competing methods in handling the missingness commonly encountered in EHR data and in providing both probability and point estimates for diverse prediction horizons during follow-up. The experiment highlights the potential of GRU-D-Weibull as a suitable candidate for individualized endpoint risk management, utilizing real-time clinical data to generate various endpoint estimates for monitoring. Additional research is warranted to evaluate the influence of different data quality aspects on prediction performance. Furthermore, collaboration with clinicians is essential to explore the integration of this approach into clinical workflows and to evaluate its effects on decision-making processes and patient outcomes.
(shorter version) Accurate prediction models for individual-level endpoints and time-to-endpoints are crucial in clinical practice. In this study, we propose a novel approach, GRU-D-Weibull, which uses gated recurrent units with decay (GRU-D) to model the Weibull distribution. Our method enables real-time individualized endpoint prediction and population-level risk management. Using a cohort of 6879 patients with stage 4 chronic kidney disease (CKD4), we evaluated the performance of GRU-D-Weibull in endpoint prediction. The C-index of GRU-D-Weibull was ~0.7 at the index date and increased to ~0.77 after 4.3 years of follow-up, similar to random survival forest. Our approach achieved an absolute L1-loss of ~1.1 years (SD ≈ 0.95) at the CKD4 index date and a minimum of ~0.45 years (SD ≈ 0.3) at 4 years of follow-up, significantly outperforming competing methods. GRU-D-Weibull consistently constrained the predicted survival probability at the time of an event within a smaller and more fixed range compared to other models throughout the follow-up peri (abstract truncated)
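The abstract reports both probability estimates and point estimates derived from a per-patient Weibull distribution. As a rough illustration only, the sketch below shows how a single pair of Weibull parameters could yield a survival probability, a point estimate of time to endpoint, and an absolute L1 error. The shape/scale parameterization, the choice of the median as the point estimate, the function names, and the example values are all assumptions made here for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import math


def weibull_survival(t: float, shape: float, scale: float) -> float:
    """Survival probability S(t) = exp(-(t / scale) ** shape) for t >= 0."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0; shape and scale must be > 0")
    return math.exp(-((t / scale) ** shape))


def weibull_median(shape: float, scale: float) -> float:
    """Median time to endpoint: the t at which S(t) = 0.5."""
    if shape <= 0 or scale <= 0:
        raise ValueError("shape and scale must be > 0")
    return scale * math.log(2.0) ** (1.0 / shape)


def l1_loss(predicted_time: float, observed_time: float) -> float:
    """Absolute error between predicted and observed time to endpoint."""
    return abs(predicted_time - observed_time)


# Hypothetical parameters for one patient at one time step (in years).
shape, scale = 1.5, 3.0
observed_time = 2.0  # hypothetical observed time to endpoint

point_estimate = weibull_median(shape, scale)
error = l1_loss(point_estimate, observed_time)
prob_at_event = weibull_survival(observed_time, shape, scale)
```

In this sketch, `prob_at_event` corresponds loosely to the "predicted survival probability at time of event" metric named in the abstract, and `error` to the per-patient absolute L1-loss. In a recurrent model, the two parameters would be re-estimated at every time step as new EHR data arrive.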
engineering, biomedical; computer science, artificial intelligence; medical informatics