Machine learning-based predictive models for perioperative major adverse cardiovascular events in patients with stable coronary artery disease undergoing non-cardiac surgery
Liang Shen,YunPeng Jin,AXiang Pan,Kai Wang,RunZe Ye,YangKai Lin,Safraz Anwar,WeiCong Xia,Min Zhou,XiaoGang Guo
DOI: https://doi.org/10.1101/2024.01.12.24301253
2024-01-01
Abstract: Background Machine learning (ML)-based predictive models for perioperative major adverse cardiovascular events (MACEs) in patients with stable coronary artery disease (SCAD) undergoing non-cardiac surgery (NCS) have not previously been reported.
Methods Clinical data from 9171 consecutive adult patients with SCAD who underwent NCS at the First Affiliated Hospital, Zhejiang University School of Medicine between January 2013 and May 2023 were used to develop and validate the prediction models. MACEs were defined as perioperative all-cause death, resuscitated cardiac arrest, myocardial infarction, heart failure, or stroke. Various resampling and feature selection methods were compared to handle the imbalanced data. A traditional logistic regression model (the Revised Cardiac Risk Index, RCRI) and nine ML models (logistic regression, support vector machine, Gaussian Naive Bayes, random forest, GBDT, XGBoost, LightGBM, CatBoost, and the best-performing stacking ensemble model) were compared using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). Calibration was assessed with calibration curves, and the patients' net benefit was measured by decision curve analysis (DCA). Models were tested via 5-fold cross-validation. Feature importance was interpreted using SHapley Additive exPlanations (SHAP).
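The two discrimination metrics named in the Methods can be illustrated from first principles. The sketch below is not the authors' code: it computes AUROC as the probability that a randomly chosen event is scored above a randomly chosen non-event, and AUPRC as step-wise average precision. The function names and the toy labels and scores are illustrative assumptions.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Visit samples from highest to lowest score.
    order = sorted(range(len(y_true)), key=lambda i: -y_score[i])
    tp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(order):
        # Samples sharing a score enter at the same threshold.
        j = i
        while j < len(order) and y_score[order[j]] == y_score[order[i]]:
            tp += y_true[order[j]]
            j += 1
        recall = tp / n_pos
        precision = tp / j  # j samples are predicted positive at this threshold
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Toy data (hypothetical): two events, two non-events.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))              # 0.75
print(round(average_precision(toy_labels, toy_scores), 4))  # 0.8333
```

A model that assigns every patient the same score has an AUROC of 0.5 but an average precision equal to the event rate, so for a cohort with a 5.6% event rate the no-skill AUPRC is about 0.056 rather than 0.5. This is why AUPRC values should be read against prevalence on imbalanced data.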
Results Among 9171 patients, 514 (5.6%) developed MACEs. XGBoost performed best in terms of AUROC (0.898) and AUPRC (0.479), both higher than those of the RCRI (AUROC 0.716, AUPRC 0.185; P<0.001 by the DeLong test and the permutation test, respectively). The calibration curve showed that XGBoost predicted the risk of MACEs accurately (Brier score 0.040), and DCA showed that XGBoost had a high net benefit for predicting MACEs. The top-ranked stacking ensemble model, consisting of CatBoost, GBDT, GNB, and LR, achieved an AUROC of 0.894 (95% CI 0.860-0.928) and an AUPRC of 0.485 (95% CI 0.383-0.587). Using the mean absolute SHAP values, we identified the top 20 important features.
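To help read the calibration and decision-curve results, the following sketch shows how a Brier score and a net benefit are computed. It is not the authors' implementation; the function names and the small example are assumptions for illustration. The Brier score is the mean squared difference between predicted probability and observed outcome, and the net benefit at a threshold probability `threshold` is TP/n minus FP/n weighted by `threshold / (1 - threshold)`.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: treat every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Reference point built from the reported counts: 514 events among 9171 patients,
# with every patient assigned the overall event rate as predicted risk.
outcomes = [1] * 514 + [0] * 8657
constant_risk = [514 / 9171] * len(outcomes)
print(round(brier_score(outcomes, constant_risk), 4))  # 0.0529

# Hypothetical five-patient example at a 20% threshold.
toy_y = [1, 0, 0, 1, 0]
toy_p = [0.9, 0.2, 0.6, 0.4, 0.1]
print(net_benefit(toy_y, toy_p, 0.2), net_benefit_treat_all(toy_y, 0.2))
```

A constant prediction equal to the 5.6% event rate gives a Brier score of about 0.053, which is a useful yardstick for the reported value of 0.040. A decision curve repeats the net-benefit calculation over a range of thresholds and compares the model with the treat-all and treat-none (net benefit 0) strategies.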
Conclusion The first ML-based models for predicting perioperative MACEs in patients with SCAD were successfully developed and validated. Patients at high risk of MACEs can be identified effectively, allowing targeted interventions to reduce the incidence of MACEs.
Lay Summary We performed a retrospective machine learning classification study of MACEs in patients with SCAD undergoing non-cardiac surgery to develop and validate an optimal prediction model. We analyzed the missing-data mechanism and identified the best imputation method for missing data, applied resampling techniques and feature selection methods suited to the imbalanced data, and ultimately identified 24 preoperative features for building a machine learning predictive model. Eight independent machine learning prediction models and stacking ensemble models were built, and the models were evaluated comprehensively using ROC curves, precision-recall curves, calibration plots, and decision curve analysis.
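The interaction between class imbalance, resampling, and 5-fold cross-validation can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: plain random oversampling stands in for whichever resampling method the study selected, and the function names are hypothetical. The key point shown is that resampling is applied to the training folds only, so the held-out fold keeps the original event rate.

```python
import random
from typing import List, Sequence


def stratified_folds(y: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Split indices into k folds with roughly equal event counts per fold."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def oversample_minority(train_idx: Sequence[int], y: Sequence[int], seed: int = 0) -> List[int]:
    """Duplicate randomly chosen minority-class indices until classes are equal."""
    rng = random.Random(seed)
    pos = [i for i in train_idx if y[i] == 1]
    neg = [i for i in train_idx if y[i] == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        raise ValueError("training data contain a single class")
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(train_idx) + extra


# Labels built from the reported counts: 514 events, 8657 non-events.
labels = [1] * 514 + [0] * 8657
print(round(8657 / 514, 1))  # imbalance ratio, about 16.8 non-events per event

folds = stratified_folds(labels, k=5, seed=0)
for fold_no, test_idx in enumerate(folds):
    train_idx = [i for j, f in enumerate(folds) if j != fold_no for i in f]
    balanced_train = oversample_minority(train_idx, labels, seed=fold_no)
    # A model would be fitted on balanced_train and scored on the untouched test_idx.
    events_in_test = sum(labels[i] for i in test_idx)
    print(fold_no, len(balanced_train), len(test_idx), events_in_test)
```

Each held-out fold contains 102 or 103 events, matching the overall 5.6% rate, while each balanced training set has equal numbers of events and non-events.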
### Competing Interest Statement
The authors have declared no competing interest.
### Funding Statement
This work was supported by grants from the National Natural Science Foundation of China (82170331), Joint Funds from the National Natural Science Foundation of China (U21A20337), and grants from the Key Research and Development Plan of Zhejiang Province (2020C03017).
### Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.
Yes
The details of the IRB/oversight body that provided approval or exemption for the research described are given below:
This study was approved by the Institutional Ethics Review Committee of the First Affiliated Hospital, Zhejiang University School of Medicine (ethical approval No. IIT20230114A). Written informed consent was waived owing to the retrospective study design, and the collected data were managed in de-identified form.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.
Yes
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).
Yes
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.
Yes
### Abbreviations
NCS
: non-cardiac surgery
SCAD
: stable coronary artery disease
MACEs
: major adverse cardiovascular events
MI
: myocardial infarction
HF
: heart failure
RCRI
: Revised Cardiac Risk Index
ML
: Machine learning
AI
: artificial intelligence
ECG
: electrocardiogram
FAHZU
: First Affiliated Hospital, Zhejiang University School of Medicine
TRIPOD
: Transparent Reporting of Multivariable Prediction Models for Individual Prognosis or Diagnosis
STROBE
: STrengthening the Reporting of OBservational studies in Epidemiology
ACC
: American College of Cardiology
AHA
: American Heart Association
ICD-10
: International Classification of Diseases, Tenth Revision
BMI
: Body Mass Index
DOS
: duration of surgery
GA
: general anesthesia
AQW
: abnormal Q waves
ST-Ta
: ST-T wave abnormalities
LVEF
: left ventricular ejection fraction
RWMA
: regional wall motion abnormality
LVDD
: left ventricular diastolic dysfunction
PH
: pulmonary hypertension
Hb
: Hemoglobin
FBG
: Fasting blood glucose
Scr
: Creatinine
ASA PS
: American Society of Anesthesiologists Physical Status
IR
: imbalance ratio
AUROC
: area under the receiver operating characteristic curve
AUPRC
: area under the precision-recall curve
SMOTE
: Synthetic minority over-sampling technique
ADASYN
: adaptive synthetic sampling
ENN
: Edited Nearest Neighbors
XGBoost
: eXtreme Gradient Boosting
CFS
: correlation-based feature selection
RFE
: recursive feature elimination
SHAP
: SHapley Additive exPlanations
LightGBM
: Light Gradient Boosting Machine
RF
: Random Forest
LR
: logistic regression
SVM
: support vector machine
GNB
: Gaussian Naive Bayes
GBDT
: gradient boosting decision tree
CatBoost
: categorical boosting
ROC
: receiver operating characteristic curve
PRC
: precision-recall curve
DCA
: decision curve analysis
SD
: standard deviation
IQR
: interquartile range
KNN
: k-Nearest Neighbor
IHD
: Ischemic heart disease
FS
: fractional shortening
LVDs
: left ventricular end systolic dimension
eGFR
: Estimated glomerular filtration rate
TSP
: Total serum protein
ALB
: Albumin
ChE
: Cholinesterase
TB
: Total bilirubin
tCa
: Total calcium
FB
: Fibrinogen
VIF
: variance inflation factor
MCAR
: missing completely at random
MAR
: missing at random
MNAR
: missing not at random