DySurv: dynamic deep learning model for survival analysis with conditional variational inference

Munib Mesinovic, Peter Watkinson, Tingting Zhu
2024-11-23
Abstract: Machine learning applications for longitudinal electronic health records often forecast the risk of events at fixed time points, whereas survival analysis achieves dynamic risk prediction by estimating time-to-event distributions. Here, we propose a novel conditional variational autoencoder-based method, DySurv, which uses a combination of static and longitudinal measurements from electronic health records to estimate the individual risk of death dynamically. DySurv directly estimates the cumulative risk incidence function without making any parametric assumptions on the underlying stochastic process of the time-to-event. We evaluate DySurv on 6 time-to-event benchmark datasets in healthcare, as well as 2 real-world intensive care unit (ICU) electronic health records (EHR) datasets extracted from the eICU Collaborative Research (eICU) and the Medical Information Mart for Intensive Care database (MIMIC-IV). DySurv outperforms other existing statistical and deep learning approaches to time-to-event analysis across concordance and other metrics. It achieves time-dependent concordance of over 60% in the eICU case. It is also over 12% more accurate and 22% more sensitive than in-use ICU scores like Acute Physiology and Chronic Health Evaluation (APACHE) and Sequential Organ Failure Assessment (SOFA) scores. The predictive capacity of DySurv is consistent and the survival estimates remain disentangled across different datasets. Our interdisciplinary framework successfully incorporates deep learning, survival analysis, and intensive care to create a novel method for time-to-event prediction from longitudinal health records. We test our method on several held-out test sets from a variety of healthcare datasets and compare it to existing in-use clinical risk scoring benchmarks.
Machine Learning
### What problem does this paper attempt to solve?

This paper addresses how to make more accurate dynamic risk predictions in survival analysis, particularly from electronic health record (EHR) data. The authors propose **DySurv**, a conditional variational autoencoder (CVAE) method that dynamically estimates an individual's risk of death from static and longitudinal EHR measurements.

#### Main problems:

1. **Limitations of existing methods**:
   - **Traditional statistical models**: Models such as the Cox proportional hazards model rely on simple, restrictive assumptions and cannot capture complex time-varying features.
   - **Deep-learning models**: Although some deep-learning models can handle time-series data, they usually rely on parametric or semi-parametric assumptions about the survival distribution, which may limit predictive performance.
2. **The need for dynamic risk prediction**:
   - In clinical practice, doctors need to predict a patient's future risk dynamically from real-time data, rather than only at a fixed point in time.
   - Existing machine-learning methods usually predict risk only at fixed time points and cannot provide continuous, time-dependent predictions.

#### Key points of the DySurv solution:

- **Non-parametric method**: DySurv makes no parametric assumptions about the underlying stochastic process and directly estimates the cumulative risk incidence function, avoiding the restrictions of traditional models.
- **Conditional variational autoencoder**: By using a conditional variational autoencoder, DySurv can extract latent features from high-dimensional, multi-modal longitudinal data and improve predictive performance.
- **Dynamic risk estimation**: DySurv combines static and time-series data to learn how a patient's risk evolves, providing more accurate and personalized risk predictions.

#### Application scenarios:

- **Intensive care unit (ICU)**: DySurv is particularly useful in the ICU because it can handle high-frequency longitudinal measurements and provide dynamic risk stratification of short-term health outcomes, supporting emergency prognosis and prevention.

### Summary:

The paper addresses the limitations of existing survival analysis methods on complex, high-dimensional electronic health record data by proposing DySurv, which achieves more accurate and dynamic risk predictions. This improves prediction accuracy and provides a new tool for personalized medicine.
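
To make the idea of estimating a cumulative incidence function without a parametric form more concrete, here is a minimal sketch of one common discrete-time construction. It assumes, hypothetically, that a model emits one logit per discretised time bin plus a final bin for "no event within the horizon". The text above does not specify DySurv's actual output layer, so the function name `cumulative_incidence`, the bin layout, and the example logits are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def cumulative_incidence(logits: np.ndarray) -> np.ndarray:
    """Convert per-patient logits into a cumulative incidence curve.

    Assumed layout (hypothetical): shape (n_patients, n_bins + 1), where the
    last column stands for "no event within the prediction horizon".
    """
    # Numerically stable softmax: probability mass of the event in each bin.
    shifted = logits - logits.max(axis=1, keepdims=True)
    pmf = np.exp(shifted)
    pmf /= pmf.sum(axis=1, keepdims=True)

    # Cumulative incidence at bin k = total event probability up to bin k.
    # The final "no event" column is excluded from the running sum.
    return np.cumsum(pmf[:, :-1], axis=1)


# Two made-up patients, three time bins plus the "no event" bin.
logits = np.array([
    [0.0, 0.0, 0.0, 0.0],   # uniform mass across all four outcomes
    [2.0, 1.0, 0.0, -1.0],  # mass concentrated in the earliest bins
])

cif = cumulative_incidence(logits)
survival = 1.0 - cif  # estimated probability of no event by each bin
```

Because the curve is a running sum of softmax probabilities, it is non-decreasing and bounded by 1 by construction, and no distributional shape (exponential, Weibull, proportional hazards) is imposed on it.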