Establishment and Validation of a Machine Learning-Based Prediction Model for Termination of Pregnancy via Cesarean Section
Weixuan Sheng, Wenpei Bai, Feiran Liu, Rui Zhang, Jin Zhang
DOI: https://doi.org/10.2147/IJGM.S413736
IF: 2.145
2023-11-01
International Journal of General Medicine
Abstract: Objective This study aimed to investigate the risk factors for cesarean section and to establish a prediction model for cesarean section based on the characteristics of pregnant women.
Methods The clinical characteristics of 2552 women with singleton pregnancies who delivered a live baby between January 2020 and December 2021 were retrospectively reviewed. The women were divided into a vaginal delivery group (n = 1850) and a cesarean section group (n = 702). The cohort was also split by time into a training set (January 2020 to June 2021) and a validation set (July 2021 to December 2021). In the training set, univariate analysis, Lasso regression, and Boruta were used to screen independent risk factors for cesarean section. Four models, Logistic Regression (LR), K-Nearest Neighbor (KNN), Classification and Regression Tree (CART), and Random Forest (RF), were established in the training set using k-fold cross-validation, hyperparameter optimization, and random oversampling. The best-performing model was selected, and a ranked plot of the feature variables, univariate partial dependence profiles, and Break Down profiles were plotted for it. In the validation set, the confusion matrix parameters were calculated, and the receiver operating characteristic (ROC) curve, precision-recall curve (PRC), calibration curve, and clinical decision curve analysis (DCA) were plotted.
Results The risk factors for cesarean section included maternal age and height, weight at delivery, weight gain, parity, assisted reproduction, abnormal blood glucose during pregnancy, pregnancy hypertension, scarred uterus, premature rupture of membranes (PROM), placenta previa, fetal malposition, thrombocytopenia, floating fetal head, and labor analgesia. RF performed best among the four models: its accuracy, computed from the confusion matrix, was 0.8956357, the Matthews correlation coefficient (MCC) was 0.753012, the area under the ROC curve (AUC-ROC) was 0.9790787, and the area under the PRC (AUC-PRC) was 0.957888.
Conclusion The RF prediction model for cesarean section shows high discrimination, accuracy, and consistency, as well as outstanding generalization ability.
Computer Science, Medicine
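The abstract names random oversampling and several evaluation metrics (accuracy, MCC, AUC-ROC) without giving formulas or code. The following is a minimal sketch of how these quantities are commonly computed, using only the Python standard library. It is not the authors' implementation: the abstract does not state which software was used, the function names (`random_oversample`, `confusion_counts`, `accuracy`, `mcc`, `auc_roc`) are hypothetical, and the labels and scores are made-up toy values rather than study data. Labels are assumed to be coded as 1 for cesarean section and 0 for vaginal delivery.

```python
import math
import random


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class (assumed scheme, not the paper's)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a positive case scores higher than
    a negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

counts = confusion_counts(y_true, y_pred)
print("tp, fp, fn, tn:", counts)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
print("AUC-ROC:", auc_roc(y_true, scores))
```

In a workflow like the one described, oversampling would be applied only to the training data (ideally within each cross-validation fold), so that duplicated cases never appear in the data used for evaluation. MCC is a useful companion to accuracy when classes are imbalanced, as with the 1850 vaginal deliveries versus 702 cesarean sections reported here.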