Interpretable Machine-Learning Model for Real-Time, Clustered Risk Factor Analysis of Sepsis and Septic Death in Critical Care

Zhengyu Jiang, Lulong Bo, Lei Wang, Yan Xie, Jianping Cao, Ying Yao, Wenbin Lu, Xiaoming Deng, Tao Yang, Jinjun Bian
DOI: https://doi.org/10.1016/j.cmpb.2023.107772
2022-01-01
Abstract:
BACKGROUND AND OBJECTIVE: Interpretable, real-time prediction of sepsis and analysis of its risk factors could enable timely treatment by clinicians and improve patient outcomes. This study aimed to develop an interpretable machine-learning model for the prediction and risk factor analysis of sepsis and septic death.
METHODS: This retrospective observational cohort study was based on the Medical Information Mart for Intensive Care (MIMIC-IV) dataset; 69,619 patients from the database were screened. The two outcomes were a diagnosis of sepsis and death among septic patients. Clinical variables recorded from ICU admission to outcome were analyzed: demographic data, vital signs, Glasgow Coma Scale scores, laboratory test results, and arterial blood gas (ABG) results. Model performance was compared using the area under the receiver operating characteristic curve (AUROC). Model interpretation was based on Shapley additive explanations (SHAP), and the clustered analysis combined K-means with the dimensionality reduction algorithms t-SNE and PCA.
RESULTS: For the analyses of sepsis and septic death, 47,185 and 2,480 patients were enrolled, respectively. The XGBoost model achieved an area under the curve (AUC) of 0.745 [0.731, 0.759] for sepsis prediction and 0.8 [0.77, 0.828] for septic death prediction. The real-time prediction model was trained to predict by day and to visualize the individual or combined effects of risk factors on the outcomes based on SHAP values. Clustered analysis separated two phenotypes with distinct risk factors among patients with septic death.
CONCLUSION: The proposed real-time, clustered prediction model for sepsis and septic death exhibited superior performance in predicting the outcomes and visualized the risk factors in a real-time, interpretable manner to help distinguish and mitigate patient risks. It thus shows promise for supporting effective clinical decision-making and a comprehensive understanding of complex diseases such as sepsis.
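The abstract reports each AUROC together with a bracketed interval but does not say how the interval was obtained. As an illustration only, the sketch below computes AUROC from its rank-based (Mann-Whitney) definition and attaches a percentile bootstrap interval. The bootstrap, the function names `auroc` and `bootstrap_ci`, and the toy data are all assumptions made for this example; they are not taken from the paper.

```python
import random


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of scores tied with pairs[i].
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    rank_sum_pos = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lost one class; AUROC is undefined, so redraw
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[round((alpha / 2) * n_boot)]
    hi = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy data, purely illustrative: 1 = outcome occurred, 0 = it did not.
y_true = [0, 0, 1, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.6, 0.3, 0.45]
point = auroc(y_true, y_score)
low, high = bootstrap_ci(y_true, y_score)
print(f"AUROC {point:.3f} [{low:.3f}, {high:.3f}]")
```

With cohorts as large as those in the abstract, a bootstrap interval would be far narrower than on this ten-row toy set; the example only shows the mechanics.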