Abstract

Introduction: Venous thromboembolism (VTE), including deep vein thrombosis (DVT) and pulmonary embolism (PE), is one of the main causes of preventable death in hospitals in the UK. The current clinical risk scores for predicting mortality in patients with VTE are the pulmonary embolism severity index (PESI) and the simplified PESI (sPESI), which have similar predictive power.

Purpose: To evaluate the ability of machine learning algorithms to predict mortality in patients admitted with VTE and to compare their predictive capability with the sPESI score for 30-day mortality.

Methods: The BBC-VTE was a retrospective multicentre patient cohort established to determine clinical features and novel aspects of risk prediction for VTE (and VTE-related complications) in a contemporary cohort. We included 1554 patients (mean age 65.6 years; 53% female) who represent all consecutive admissions with a final diagnosis of VTE to one of three regional hospitals in the West Midlands, UK, during the years 2012–2014. The dataset was split into training (70%) and validation (30%) cohorts. We trained two tree-based models, Random Forests (RF) and XGBoost (XG), using 5-fold cross-validation on the training cohort to predict patient mortality. The models were validated on the held-out validation cohort and compared with a simple logistic regression model. To provide a comparison with the sPESI score, we extracted a subgroup of patients (n=652) who had values for oxygen saturation, systolic blood pressure, heart rate, history of cancer, history of cardiopulmonary disease, and age. We used RF to predict mortality using (i) only the sPESI variables listed and (ii) all the clinical variables available to us. These predictions were then compared against the standard sPESI prediction for this subgroup. C-indices (AUC) were used for comparison.

Results: The c-indices for RF and XG using the full patient cohort were 0.85 [95% CI: 0.80–0.90] (Fig. 1a) and 0.82 [95% CI: 0.77–0.87], respectively, with the logistic regression c-index being 0.83 [95% CI: 0.78–0.88]. The reported sPESI c-index (0.75 [95% CI: 0.69–0.80]) was significantly smaller (p<0.05) than the RF c-index. The most important features for prediction of mortality indicated by the RF algorithm were age, admission blood levels, discharge oral anticoagulation, and previous malignancy (Fig. 2). The sPESI score c-index for the subgroup of patients was 0.72. In comparison, RF with the same variables gave a significantly larger (p<0.05) c-index of 0.78 [95% CI: 0.73–0.83]. When all available clinical variables were used, the c-index increased to 0.85 [95% CI: 0.80–0.90] (Fig. 1b).

Conclusion: Application of machine learning using simple clinical variables in hospital settings can improve prediction of mortality after a VTE event beyond the current simplified PESI risk score. A prospective study is warranted to validate the algorithm on external datasets and to construct individualised risk predictions.

Funding Acknowledgement: Type of funding sources: None.

Figure 1. ROC curve comparisons with sPESI
Figure 2. Feature importances
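The comparison metric used in the abstract, the c-index (AUC), can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code. It assumes the sPESI cut-offs from the published score definition (one point each for age > 80 years, history of cancer, chronic cardiopulmonary disease, heart rate ≥ 110 bpm, systolic blood pressure < 100 mmHg, and oxygen saturation < 90%), which the abstract itself does not state. The percentile bootstrap used for the 95% CI is also an assumption, since the abstract does not say how its intervals were obtained. The names `Patient`, `spesi_score`, `c_index`, and `bootstrap_ci` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical record holding only the six sPESI variables."""
    age: float
    cancer: bool
    cardiopulmonary_disease: bool
    heart_rate: float         # beats per minute
    systolic_bp: float        # mmHg
    oxygen_saturation: float  # percent


def spesi_score(p: Patient) -> int:
    """sPESI: one point per criterion met (thresholds assumed from the published score)."""
    return sum(int(flag) for flag in (
        p.age > 80,
        p.cancer,
        p.cardiopulmonary_disease,
        p.heart_rate >= 110,
        p.systolic_bp < 100,
        p.oxygen_saturation < 90,
    ))


def c_index(scores, outcomes):
    """AUC for a binary outcome: probability that a patient who died has a
    higher score than a survivor, counting ties as one half."""
    died = [s for s, y in zip(scores, outcomes) if y == 1]
    survived = [s for s, y in zip(scores, outcomes) if y == 0]
    if not died or not survived:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for sd in died:
        for ss in survived:
            if sd > ss:
                concordant += 1.0
            elif sd == ss:
                concordant += 0.5
    return concordant / (len(died) * len(survived))


def bootstrap_ci(scores, outcomes, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the c-index (assumed method, see lead-in)."""
    c_index(scores, outcomes)  # fail early if one outcome class is missing
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcomes[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(c_index([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Tiny made-up cohort, for illustration only (not BBC-VTE data).
    cohort = [
        (Patient(85, True, False, 115, 95, 88), 1),
        (Patient(72, False, True, 100, 130, 93), 1),
        (Patient(60, False, False, 80, 120, 97), 0),
        (Patient(55, True, False, 90, 125, 96), 0),
        (Patient(81, False, False, 85, 110, 95), 0),
    ]
    scores = [spesi_score(p) for p, _ in cohort]
    died = [y for _, y in cohort]
    print("c-index:", c_index(scores, died))
    print("95% CI:", bootstrap_ci(scores, died, n_boot=200))
```

The same `c_index` function accepts predicted probabilities from any model in place of the integer sPESI score, which is what allows a rule-based score and a machine learning model to be compared on one scale.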

Development and Validation of an ICU-Venous Thromboembolism Prediction Model Using Machine Learning Approaches: A Multicenter Study

A Risk Prediction Model for Efficient Intubation in the Emergency Department: A Five-Year Single-Center Retrospective Analysis

Prediction of Post-Stroke Urinary Tract Infection Risk in Immobile Patients Using Machine Learning: an Observational Cohort Study.

Prediction of Central Venous Catheter-Associated Deep Venous Thrombosis in Pediatric Critical Care Settings.

Development and Validation of a Risk Prediction Model for Venous Thromboembolism in Lung Cancer Patients Using Machine Learning

A Machine Learning Approach to Predict Deep Venous Thrombosis Among Hospitalized Patients

Prediction of Venous Thromboembolism in Diverse Populations Using Machine Learning and Structured Electronic Health Records

Construction and validation of risk prediction models for pulmonary embolism in hospitalized patients based on different machine learning methods

Machine Learning Predicts Cancer-Associated Deep Vein Thrombosis Using Clinically Available Variables

Prediction of large vessel occlusion for ischaemic stroke by using the machine learning model random forests

Machine learning-based prediction of the post-thrombotic syndrome: Model development and validation study

Early prognosis prediction for non-variceal upper gastrointestinal bleeding in the intensive care unit: based on interpretable machine learning

Comparing Different Venous Thromboembolism Risk Assessment Machine Learning Models in Chinese Patients

Development of a Risk Assessment Tool for Venous Thromboembolism among Hospitalized Patients in the ICU

Machine learning risk prediction model for bloodstream infections related to totally implantable venous access ports in patients with cancer

The Use of Machine Learning Techniques to Predict Deep Vein Thrombosis in Rehabilitation Inpatients

Development and Validation of a Clinical Prediction Model for Venous Thromboembolism Following Neurosurgery: A 6-Year, Multicenter, Retrospective and Prospective Diagnostic Cohort Study

Retrospective analysis of interpretable machine learning in predicting ICU thrombocytopenia in geriatric ICU patients

Machine learning prediction of mortality in venous thromboembolism patients: the Birmingham Black Country Venous Thromboembolism (BBC-VTE) cohort

Peripherally inserted central-related upper extremity deep vein thrombosis and machine learning

Machine learning-based prediction model of lower extremity deep vein thrombosis after stroke