Establishment and validation of a heart failure risk prediction model for elderly patients after coronary rotational atherectomy based on machine learning
Lixiang Zhang,Xiaojuan Zhou,Jiaoyu Cao
DOI: https://doi.org/10.7717/peerj.16867
IF: 3.061
2024-01-31
PeerJ
Abstract: Objective: To develop and validate a machine learning-based heart failure risk prediction model for elderly patients after coronary rotational atherectomy (CRA). Methods: A retrospective cohort study enrolled 303 elderly patients with severe coronary calcification. According to the occurrence of postoperative heart failure, the patients were divided into a heart failure group (n = 53) and a non-heart failure group (n = 250). Clinical data recorded during hospitalization were collected retrospectively. After missing values in the original data were processed and sample imbalance was addressed with the Adaptive Synthetic Sampling (ADASYN) method, the final dataset consisted of 502 samples: 250 negative samples (i.e., patients without heart failure) and 252 positive samples (i.e., patients with heart failure). The 502 samples were randomly divided at a 7:3 ratio into a training set (n = 351) and a validation set (n = 151). On the training set, logistic regression (LR), extreme gradient boosting (XGBoost), support vector machine (SVM), and light gradient boosting machine (LightGBM) algorithms were used to construct heart failure risk prediction models. Model performance was evaluated on the validation set by calculating the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value, F1-score, and prediction accuracy. Results: Of the 303 patients, 17.49% developed postoperative heart failure. The AUCs of the LR, XGBoost, SVM, and LightGBM models in the training set were 0.872, 1.000, 0.699, and 1.000, respectively. After 10-fold cross-validation, the AUCs in the training set were 0.863, 0.972, 0.696, and 0.963, respectively. XGBoost had the highest cross-validated AUC and the best predictive performance, whereas the SVM model performed worst. The XGBoost model also showed good predictive performance in the validation set (AUC = 0.972, 95% CI [0.951–0.994]). The Shapley additive explanation (SHAP) method suggested that six characteristic variables (blood cholesterol, serum creatinine, fasting blood glucose, age, triglyceride, and NT-proBNP) were important positive factors for the occurrence of heart failure, and that LVEF was an important negative factor. Conclusion: The seven characteristic variables of blood cholesterol, serum creatinine, fasting blood glucose, NT-proBNP, age, triglyceride, and LVEF are all important factors affecting the occurrence of heart failure. The XGBoost-based heart failure risk prediction model for elderly patients after CRA is superior to the SVM, LightGBM, and traditional LR models. This model could be used to assist clinical decision-making and reduce adverse outcomes in patients after CRA.
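The evaluation metrics named in the abstract (AUC, sensitivity, specificity, positive predictive value, negative predictive value, F1-score, and accuracy) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical and serve only to show how each quantity is defined.

```python
import math
from typing import Dict, Sequence


def safe_div(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute threshold-based metrics for binary labels (1 = heart failure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    sensitivity = safe_div(tp, tp + fn)  # recall on the positive class
    specificity = safe_div(tn, tn + fp)  # recall on the negative class
    ppv = safe_div(tp, tp + fp)          # positive predictive value (precision)
    npv = safe_div(tn, tn + fn)          # negative predictive value
    f1 = safe_div(2 * ppv * sensitivity, ppv + sensitivity)
    accuracy = safe_div(tp + tn, len(pairs))
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "accuracy": accuracy,
    }


def auc_from_scores(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg), which is
    fine for an illustration but not for large datasets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Hypothetical toy data: four positives, four negatives, predicted risk scores.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

toy_metrics = classification_metrics(toy_labels, toy_preds)
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(toy_metrics)
print("AUC:", toy_auc)
```

On this toy data each threshold-based metric equals 0.75 and the AUC equals 0.8125 (13 of 16 positive-negative pairs are ranked correctly). The AUC depends only on the ranking of the scores, whereas the other six metrics depend on the chosen threshold.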
multidisciplinary sciences