Abstract:
Background: In the era of healthcare digital transformation, using electronic health record (EHR) data to generate various endpoint estimates for active monitoring is highly desirable in chronic disease management. However, traditional predictive modeling strategies that rely on well-curated data sets can have limited real-world implementation potential because of various data quality issues in EHR data.
Methods: We propose a novel predictive modeling approach, GRU-D-Weibull, which uses gated recurrent units with decay (GRU-D) to model a Weibull distribution, for real-time individualized endpoint prediction and population-level risk management using EHR data.
Experiments: We systematically evaluated the performance and showcased the real-world implementability of the proposed approach through individual-level endpoint prediction in a cohort of patients with chronic kidney disease stage 4 (CKD4). A total of 536 features, including ICD/CPT codes, medications, lab tests, vital measurements, and demographics, were retrieved for 6879 CKD4 patients. Performance metrics, including C-index, L1-loss, Parkes' error, and predicted survival probability at time of event, were compared between GRU-D-Weibull and alternative approaches: accelerated failure time model (AFT), XGBoost-based AFT (XGB(AFT)), random survival forest (RSF), and Nnet-survival. Both in-process and post-process calibration were tested on the survival probabilities generated by GRU-D-Weibull.
Results: GRU-D-Weibull achieved a C-index of ~0.7 at index date, which increased to ~0.77 at 4.3 years of follow-up, comparable to that of RSF. GRU-D-Weibull achieved an absolute L1-loss of ~1.1 years (sd ≈ 0.95) at CKD4 index date and a minimum of ~0.45 years (sd ≈ 0.3) at 4 years of follow-up, compared with ~1.4 years (sd ≈ 1.1) at index date and ~0.64 years (sd ≈ 0.26) at 4 years for the second-ranked RSF. Both significantly outperformed the other competing approaches.
GRU-D-Weibull constrained the predicted survival probability at time of event to a smaller and more fixed range than competing models throughout follow-up. Significant correlations were observed between prediction error and the missing proportions of all major categories of input features at index date (correlation ~0.1 to ~0.3), which faded within 1 year after index date as more data became available. Through post-training recalibration, we achieved close alignment between predicted and observed survival probabilities across multiple prediction horizons at different time points during follow-up.
Conclusion: GRU-D-Weibull shows advantages over competing methods in handling the missingness commonly encountered in EHR data and in providing both probability and point estimates for diverse prediction horizons during follow-up. The experiment highlights the potential of GRU-D-Weibull as a suitable candidate for individualized endpoint risk management, utilizing real-time clinical data to generate various endpoint estimates for monitoring. Additional research is warranted to evaluate the influence of different data quality aspects on prediction performance. Furthermore, collaboration with clinicians is essential to explore the integration of this approach into clinical workflows and to evaluate its effects on decision-making processes and patient outcomes.
(Shorter version) Accurate prediction models for individual-level endpoints and time-to-endpoints are crucial in clinical practice. In this study, we propose a novel approach, GRU-D-Weibull, which uses gated recurrent units with decay (GRU-D) to model the Weibull distribution. Our method enables real-time individualized endpoint prediction and population-level risk management. Using a cohort of 6879 patients with stage 4 chronic kidney disease (CKD4), we evaluated the performance of GRU-D-Weibull in endpoint prediction.
The C-index of GRU-D-Weibull was ~0.7 at the index date and increased to ~0.77 after 4.3 years of follow-up, similar to that of random survival forest. Our approach achieved an absolute L1-loss of ~1.1 years (SD ≈ 0.95) at the CKD4 index date and a minimum of ~0.45 years (SD ≈ 0.3) at 4 years of follow-up, significantly outperforming competing methods. GRU-D-Weibull consistently constrained the predicted survival probability at the time of an event within a smaller and more fixed range compared to other models throughout the follow-up peri... [Abstract truncated]
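
The quantities named in the abstract (a Weibull-based survival probability, a point estimate of time-to-endpoint, L1-loss, and C-index) can be illustrated with a minimal sketch. It assumes the standard two-parameter Weibull survival function S(t) = exp(-(t/scale)^shape), uses the Weibull median as the point estimate, and treats every event as observed when computing L1-loss. The function names, the choice of the median, and the example patient values are all hypothetical and are not taken from the paper. In the approach described above, per-patient Weibull parameters would come from the GRU-D network, which is not modeled here.

```python
import math


def weibull_survival(t, scale, shape):
    """Survival probability S(t) = exp(-(t / scale) ** shape) for t >= 0."""
    if t < 0:
        raise ValueError("t must be non-negative")
    return math.exp(-((t / scale) ** shape))


def weibull_median(scale, shape):
    """Time at which S(t) = 0.5; one possible point estimate (an assumption)."""
    return scale * math.log(2.0) ** (1.0 / shape)


def l1_loss(predicted_times, observed_times):
    """Mean absolute error between predicted and observed event times."""
    if len(predicted_times) != len(observed_times):
        raise ValueError("inputs must have the same length")
    total = sum(abs(p - o) for p, o in zip(predicted_times, observed_times))
    return total / len(predicted_times)


def concordance_index(predicted_times, observed_times, event_observed):
    """Harrell's C-index using predicted time as the ranking score.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's observed time. Ties in prediction count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(observed_times)
    for i in range(n):
        for j in range(n):
            if event_observed[i] and observed_times[i] < observed_times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical patients: (scale in years, shape, observed event time in years).
patients = [(2.0, 1.5, 1.2), (4.0, 1.2, 3.5), (1.0, 2.0, 0.9)]

predicted = [weibull_median(scale, shape) for scale, shape, _ in patients]
observed = [time for _, _, time in patients]

# Predicted survival probability evaluated at each patient's event time.
prob_at_event = [weibull_survival(time, scale, shape)
                 for scale, shape, time in patients]

mean_abs_error = l1_loss(predicted, observed)
c_index = concordance_index(predicted, observed, [True] * len(patients))
```

With censored patients, the L1-loss above would need a rule for which patients to include, and that rule is not specified in the text shown here.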

GRU-D-Weibull: A novel real-time individualized endpoint prediction

Discrimination, calibration, and point estimate accuracy of GRU-D-Weibull architecture for real-time individualized endpoint prediction

Assessing Statewide All-Cause Future One-Year Mortality: Prospective Study With Implications for Quality of Life, Resource Utilization, and Medical Futility

Time-series deep survival prediction for hemodialysis patients using an attention-based Bi-GRU network

Endpoint prediction of heart failure using electronic health records

Data-driven, two-stage machine learning algorithm-based prediction scheme for assessing 1-year and 3-year mortality risk in chronic hemodialysis patients

Prediction of chronic kidney disease progression using recurrent neural network and electronic health records

Time-dependent LSTM for Survival Prediction and Patient Subtyping in Kidney Disease Trajectory

ESKD Risk Prediction Model in a Multicenter Chronic Kidney Disease Cohort in China: A Derivation, Validation, and Comparison Study

Risk Projection for Time-to-Event Outcome Leveraging Summary Statistics With Source Individual-Level Data

Machine-learning-based Web system for the prediction of chronic kidney disease progression and mortality

Enhancing End Stage Renal Disease Outcome Prediction: A Multi-Sourced Data-Driven Approach

An Approach for Personalized Dynamic Assessment of Chronic Kidney Disease Progression Using Joint Model

Designing an Implementable Clinical Prediction Model for Near-Term Mortality and Long-Term Survival in Patients on Maintenance Hemodialysis

Individualized prediction of chronic kidney disease for the elderly in longevity areas in China: Machine learning approaches

Predicting the risks of kidney failure and death in adults with moderate to severe chronic kidney disease: multinational, longitudinal, population based, cohort study

Transformer-based time-to-event prediction for chronic kidney disease deterioration

Predicting chronic kidney disease progression using small pathology datasets and explainable machine learning models

Personalized Prediction of Long-Term Renal Function Prognosis Following Nephrectomy Using Interpretable Machine Learning Algorithms: Case-Control Study

Computational and Human Intelligence Methods for Constructing Practical Risk Prediction Models: An Application to Cardio-Renal Outcomes in Non-Diabetic CKD Patients

Prediction of all-cause mortality for chronic kidney disease patients using four models of machine learning