Interpretable Prediction of Hospital Mortality in Bleeding Critically Ill Patients Based on Machine Learning and SHAP (Preprint)

Bingkui Ren, Yuping Zhang, Siying Chen, Jinglong Dai, Junci Chong, Yifei Zhong, Mengkai Deng, Shaobo Jiang, Zhigang Chang
DOI: https://doi.org/10.2196/preprints.66363
2024-01-01
Abstract: Hemorrhage is a common and critical condition in the intensive care unit (ICU), marked by high incidence, elevated mortality, and substantial therapeutic challenges. Accurate prediction of mortality in patients with hemorrhage is essential for developing personalized prevention and treatment strategies. Nevertheless, effective predictive models are rarely implemented in clinical practice, largely because robust and interpretable prediction tools are lacking. This study aimed to develop an interpretable model for predicting mortality risk in critically ill patients with hemorrhage in the ICU. The SHapley Additive exPlanation (SHAP) method was applied to interpret the extreme gradient boosting (XGBoost) model, allowing exploration of key prognostic factors in this patient population. In this retrospective cohort study, we developed and evaluated a predictive model using data from the eICU Collaborative Research Database (eICU-CRD). Data from the first 24 hours of each ICU admission were extracted, and the dataset was randomly split into a training set (80%) and a validation set (20%). The predictive performance of the XGBoost model was compared with that of four other machine learning models using the area under the curve (AUC) as the metric. Following initial validation, external validation was performed using data from REFRAIN, a Chinese retrospective cohort that focuses on hemorrhage and coagulopathy in critically ill patients. A total of 10,306 eligible patients with hemorrhage were included in the final cohort. The observed in-hospital mortality of patients with hemorrhage was 11.5%. The XGBoost model had the highest predictive performance among the five models (AUC=0.81), whereas logistic regression (LR) had the poorest generalization ability (AUC=0.726).
The decision curve showed that the net benefit of the XGBoost model exceeded that of the other machine learning models at threshold probabilities of 10% to 30%. The SHAP method identified the top 15 predictors of mortality in patients with hemorrhage according to the importance ranking, with bilirubin level recognized as the most important predictor variable. In external validation using the REFRAIN cohort, the XGBoost model demonstrated robust predictive performance with an AUC of 0.776. The interpretable predictive model improves the accuracy of mortality risk prediction in ICU patients with hemorrhage, enabling clinicians to devise more effective treatment plans and optimize resource allocation. Moreover, the interpretability framework increases model transparency, helping clinicians understand and trust the predictive model.
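The two evaluation quantities named in the abstract, the AUC and the decision-curve net benefit, can be illustrated with a minimal sketch. It assumes that AUC means the area under the ROC curve and that net benefit follows the standard decision-curve definition, TP/n - (FP/n) * pt / (1 - pt), at threshold probability pt. The function names and the toy labels and risks are hypothetical; this is not the authors' code or data.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit of flagging patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy labels (1 = died in hospital) and predicted risks; NOT study data.
labels = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
risks = [0.05, 0.10, 0.80, 0.20, 0.35, 0.15, 0.40, 0.60, 0.02, 0.25]

print(f"AUC = {auc(labels, risks):.3f}")
for pt in (0.10, 0.20, 0.30):
    print(f"net benefit at {pt:.0%} = {net_benefit(labels, risks, pt):.3f}")
```

A decision curve plots this net benefit across a range of thresholds for each model. The abstract's comparison corresponds to the XGBoost curve lying above the others between 10% and 30%.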