Fast and Interpretable Mortality Risk Scores for Critical Care Patients

Chloe Qinyu Zhu, Muhang Tian, Lesia Semenova, Jiachang Liu, Jack Xu, Joseph Scarpa, Cynthia Rudin
2023-11-22
Abstract: Prediction of mortality in intensive care unit (ICU) patients is an important task in critical care medicine. Prior work in creating mortality risk models falls into two major categories: domain-expert-created scoring systems, and black box machine learning (ML) models. Both of these have disadvantages: black box models are unacceptable for use in hospitals, whereas manual creation of models (including hand-tuning of logistic regression parameters) relies on humans to perform high-dimensional constrained optimization, which leads to a loss in performance. In this work, we bridge the gap between accurate black box models and hand-tuned interpretable models. We build on modern interpretable ML techniques to design accurate and interpretable mortality risk scores. We leverage the largest existing public ICU monitoring datasets, namely the MIMIC III and eICU datasets. By evaluating risk across medical centers, we are able to study generalization across domains. In order to customize our risk score models, we develop a new algorithm, GroupFasterRisk, which has several important benefits: (1) it uses a hard sparsity constraint, allowing users to directly control the number of features; (2) it incorporates group sparsity to allow more cohesive models; (3) it allows for monotonicity correction on models for including domain knowledge; (4) it produces many equally-good models at once, which allows domain experts to choose among them. GroupFasterRisk creates its risk scores within hours, even on the large datasets we study here. GroupFasterRisk's risk scores perform better than risk scores currently used in hospitals, and have similar prediction performance to black box ML models (despite being much sparser). Because GroupFasterRisk produces a variety of risk scores and handles constraints, it allows design flexibility, which is the key enabler of practical and trustworthy model creation.
Machine Learning, Computers and Society
What problem does this paper attempt to address?
The paper addresses the prediction of in-hospital mortality risk for patients in the intensive care unit (ICU). Specifically, it aims to develop a mortality risk scoring system that is both accurate and interpretable, remedying the deficiencies of the two existing types of mortality risk models: scoring systems created by domain experts and black box machine learning (ML) models. The former lack predictive performance, while the latter are unsuitable for hospital use because they are not transparent. With a new algorithm, GroupFasterRisk, the authors aim to improve predictive accuracy while preserving interpretability, making the resulting models more practical and reliable.

### Main contributions:

1. **Proposing the GroupFasterRisk algorithm**: An interpretable machine learning algorithm that automatically generates multiple high-quality risk scores. GroupFasterRisk jointly optimizes feature selection, the thresholds at which risk increases, and the integer weights. It also lets users directly control the number of features, and it produces more coherent and interpretable models through group sparsity and monotonicity correction.
2. **Diverse, accurate risk scores**: Using GroupFasterRisk, the researchers provide multiple high-quality mortality risk scores, with different sparsity constraints and numbers of variables, for predicting mortality risk in the intensive care setting.
3. **Cross-domain generalization**: Extensive out-of-distribution (OOD) testing on hospitals not used in training shows that the risk scores generated by GroupFasterRisk outperform the existing OASIS and SAPS II scores and, with fewer variables, perform comparably to or better than APACHE IV/IVa.
4. **Performance in different subpopulations**: GroupFasterRisk performs better than, or at least as well as, the existing scoring systems in specific disease subpopulations such as sepsis, acute myocardial infarction, heart failure, and acute renal failure.
5. **Effectiveness of feature selection**: The features selected by GroupFasterRisk are more informative than those used by OASIS and can support the development of machine learning models with higher predictive performance.

Through these contributions, the paper addresses the limitations of current mortality risk scoring systems and provides a more accurate and reliable tool for clinical decision-making.
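To make the idea of a threshold-based risk score with integer weights concrete, here is a minimal sketch of how such a score could be evaluated for one patient. It is not the paper's model or the GroupFasterRisk implementation: the feature names, thresholds, point values, `MULTIPLIER`, and `INTERCEPT` are all hypothetical, and the logistic mapping from points to risk is an assumption made for illustration. Each feature forms one group, so several thresholds on the same feature count as a single group.

```python
import math

# Hypothetical score card (illustrative values, not taken from the paper).
# Each row is (feature, threshold, integer points); a row fires when the
# patient's value is <= threshold.
SCORE_CARD = [
    ("gcs_min", 8.0, 3),
    ("gcs_min", 12.0, 2),
    ("urine_output_ml", 500.0, 2),
    ("age_years", 50.0, -2),
]
MULTIPLIER = 0.5   # hypothetical scaling from points to log-odds
INTERCEPT = -3.0   # hypothetical baseline log-odds


def total_points(patient):
    """Sum the integer points of every row whose threshold condition holds."""
    return sum(
        points
        for feature, threshold, points in SCORE_CARD
        if patient[feature] <= threshold
    )


def predicted_risk(patient):
    """Map total points to a probability with a logistic link (an assumption)."""
    log_odds = INTERCEPT + MULTIPLIER * total_points(patient)
    return 1.0 / (1.0 + math.exp(-log_odds))


def groups_used():
    """Distinct underlying features; all thresholds on a feature form one group."""
    return {feature for feature, _, _ in SCORE_CARD}


sick = {"gcs_min": 7, "urine_output_ml": 400, "age_years": 70}
well = {"gcs_min": 15, "urine_output_ml": 2000, "age_years": 40}
print(total_points(sick), round(predicted_risk(sick), 3))
print(total_points(well), round(predicted_risk(well), 3))
print(len(SCORE_CARD), "rows from", len(groups_used()), "groups")
```

In this toy card the points for `gcs_min` can only grow as the value falls, which is the kind of monotone behavior that a monotonicity correction would be meant to enforce. A group sparsity constraint would cap the number of distinct features (three here) rather than the number of rows (four here).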