Abstract

Introduction: Venous thromboembolism (VTE), including deep vein thrombosis (DVT) and pulmonary embolism (PE), is one of the main causes of preventable death in hospitals in the UK. The current clinical risk scores for predicting mortality in patients with VTE are the pulmonary embolism severity index (PESI) and the simplified PESI (sPESI), which have similar predictive power.

Purpose: To evaluate the ability of machine learning algorithms to predict mortality in patients admitted with VTE and to compare their predictive capability with the sPESI score for 30-day mortality.

Methods: The Birmingham Black Country Venous Thromboembolism (BBC-VTE) cohort was a retrospective multicentre patient cohort established to determine clinical features and novel aspects of risk prediction for VTE (and VTE-related complications) in a contemporary cohort. We include 1554 patients (mean age 65.6 years; 53% female) who represent all consecutive admissions with a final diagnosis of VTE to one of three regional hospitals in the West Midlands, UK, during the years 2012–2014. The dataset was split into training (70%) and validation (30%) cohorts. We trained two tree-based models, Random Forests (RF) and XGBoost (XG), using 5-fold cross-validation on the training cohort to predict patient mortality. The models were validated on the held-out validation cohort and compared with a simple logistic regression model. To provide a comparison with the sPESI score, we extracted a sub-group of patients (n=652) who had values for oxygen saturation, systolic blood pressure, heart rate, history of cancer, history of cardiopulmonary disease, and age. We used RF to predict mortality using: i) only the sPESI variables listed; and ii) all the clinical variables available to us. These predictions were then compared against the standard sPESI prediction for this sub-group. C-indices (AUC) were used for comparison.

Results: The c-indices for RF and XG using the full patient cohort were 0.85 [95% CI: 0.80–0.90] (Fig. 1a) and 0.82 [95% CI: 0.77–0.87], respectively, with the logistic regression c-index being 0.83 [95% CI: 0.78–0.88]. The reported sPESI c-index (0.75 [95% CI: 0.69–0.80]) was significantly smaller (p<0.05) than the RF c-index. The most important features for prediction of mortality indicated by the RF algorithm were age, admission blood levels, discharge oral anticoagulation, and previous malignancy (Fig. 2). The sPESI score c-index for the sub-group of patients was 0.72. In comparison, RF with the same variables gave a significantly larger (p<0.05) c-index of 0.78 [95% CI: 0.73–0.83]. When all available clinical variables were used, the c-index increased to 0.85 [95% CI: 0.80–0.90] (Fig. 1b).

Conclusion: Application of machine learning using simple clinical variables in hospital settings can improve prediction of mortality after a VTE event beyond the current simplified PESI risk score. A prospective study is warranted to validate the algorithm on external datasets and to construct individualised risk predictions.

Funding Acknowledgement: Type of funding sources: None.

Figure 1. ROC curve comparisons with sPESI
Figure 2. Feature Importances
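The sPESI baseline and the c-index used for comparison can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `Patient` field names are hypothetical, the thresholds are the commonly published sPESI criteria (one point each for age over 80 years, history of cancer, chronic cardiopulmonary disease, heart rate of 110 bpm or more, systolic blood pressure below 100 mmHg, and oxygen saturation below 90%), and the c-index is computed by direct pairwise comparison, without the confidence intervals or significance tests reported in the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Patient:
    # Hypothetical field names; the cohort's real variable names are not given.
    age: int
    history_of_cancer: bool
    cardiopulmonary_disease: bool
    heart_rate: int            # beats per minute
    systolic_bp: int           # mmHg
    oxygen_saturation: float   # percent


def spesi(p: Patient) -> int:
    """Simplified PESI: one point per criterion met, so the score is 0 to 6."""
    return sum([
        p.age > 80,
        p.history_of_cancer,
        p.cardiopulmonary_disease,
        p.heart_rate >= 110,
        p.systolic_bp < 100,
        p.oxygen_saturation < 90,
    ])


def c_index(scores: Sequence[float], died: Sequence[bool]) -> float:
    """AUC for a binary outcome: the share of (died, survived) pairs in which
    the patient who died has the higher score; tied scores count as half."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    concordant = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in pos
        for b in neg
    )
    return concordant / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up patients and outcomes, for illustration only.
    cohort = [
        Patient(85, True, False, 115, 95, 88.0),
        Patient(60, False, False, 80, 120, 98.0),
        Patient(72, False, True, 112, 110, 93.0),
        Patient(45, False, False, 70, 130, 99.0),
    ]
    outcomes = [True, False, True, False]
    scores = [spesi(p) for p in cohort]
    print("sPESI scores:", scores)
    print("c-index:", c_index(scores, outcomes))
```

The same `c_index` function accepts any risk score, so predicted probabilities from a Random Forest or logistic regression model could be passed in place of the sPESI points to compare models on one validation cohort.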

Massive external validation of a machine learning algorithm to predict pulmonary embolism in hospitalized patients

Predicting Pulmonary Embolism among Hospitalized Patients with Machine Learning Algorithms

Early Detection of Pulmonary Embolism in a General Patient Population Immediately Upon Hospital Admission Using Machine Learning to Identify New, Unidentified Risk Factors: Model Development Study

Construction and validation of risk prediction models for pulmonary embolism in hospitalized patients based on different machine learning methods

Establishment of Machine Learning-Based Tool for Early Detection of Pulmonary Embolism

Interpretable Machine Learning Approach for Predicting 30-Day Mortality of Critical Ill Patients with Pulmonary Embolism and Heart Failure: A Retrospective Study

Development and validation of a novel model to predict pulmonary embolism in cardiology suspected patients: A 10-year retrospective analysis

Improving Cardiovascular Risk Prediction Through Machine Learning Modelling of Irregularly Repeated Electronic Health Records

External validation of machine learning prediction model for pulmonary hypertension due to left heart disease

Machine learning prediction of mortality in venous thromboembolism patients: the Birmingham Black Country Venous Thromboembolism (BBC-VTE) cohort

A machine learning model for diagnosing acute pulmonary embolism and comparison with Wells score, revised Geneva score, and Years algorithm

Machine learning-based prediction of pulmonary embolism to reduce unnecessary computed tomography scans in gastrointestinal cancer patients: a retrospective multicenter study

At-admission prediction of mortality and pulmonary embolism in an international cohort of hospitalised patients with COVID-19 using statistical and machine learning methods

Development and Validation of a Natural Language Processing Model to Identify Low-Risk Pulmonary Embolism in Real Time to Facilitate Safe Outpatient Management

Machine Learning-Based Prediction of Pulmonary Embolism Prognosis Using Nutritional and Inflammatory Indices

Development and Validation of an ICU-Venous Thromboembolism Prediction Model Using Machine Learning Approaches: A Multicenter Study

Early Prediction of Ventilator-Associated Pneumonia in ICU Patients Using An Interpretable Machine Learning Algorithm

Abstract 12497: Pulmonary Embolism Mortality Prediction With Deep Learning Based on Computed Tomographic Pulmonary Angiography and Clinical Data

Development and validation of a prediction model to estimate risk of acute pulmonary embolism in deep vein thrombosis patients

Multicentre validation of a machine learning model for predicting respiratory failure after noncardiac surgery

At-Admission Prediction of Mortality and Pulmonary Embolism in COVID-19 Patients Using Statistical and Machine Learning Methods: An International Cohort Study