Prediction model of preeclampsia using machine learning based methods: a population based cohort study in China

Taishun Li, Mingyang Xu, Yuan Wang, Ya Wang, Huirong Tang, Honglei Duan, Guangfeng Zhao, Mingming Zheng, Yali Hu
DOI: https://doi.org/10.3389/fendo.2024.1345573
IF: 6.055
2024-06-12
Frontiers in Endocrinology
Abstract:
Introduction: Preeclampsia is a disease with an unknown pathogenesis and is one of the leading causes of maternal and perinatal morbidity. At present, early identification of high-risk groups followed by timely intervention with aspirin is an effective way to prevent preeclampsia. This study aims to develop a robust preeclampsia prediction model using machine learning algorithms based on maternal characteristics and on biophysical and biochemical markers at 11–13+6 weeks' gestation, providing an effective tool for early screening and prediction of preeclampsia.
Methods: This study included 5116 women with singleton pregnancies who underwent screening for preeclampsia (PE) and fetal aneuploidy in a prospective longitudinal cohort study in China. Maternal characteristics (such as maternal age, height, and pre-pregnancy weight), past medical history, mean arterial pressure, uterine artery pulsatility index, pregnancy-associated plasma protein A (PAPP-A), and placental growth factor (PLGF) were collected as covariates for the prediction model. Five classification algorithms were used for model development: Logistic Regression, Extra Trees Classifier, Voting Classifier, Gaussian Process Classifier, and Stacking Classifier. Five-fold cross-validation with an 8:2 train-test split was applied for model validation.
Results: The final analysis included 4644 pregnant women, among whom there were 49 cases of preterm preeclampsia and 161 cases of term preeclampsia. With all covariates included, the Voting Classifier achieved a higher area under the curve (AUC) and a higher detection rate (DR) at a 10% false-positive rate (FPR) than the other algorithms in predicting preterm preeclampsia (AUC = 0.884, DR at 10% FPR = 0.625). However, its performance was similar to that of the other algorithms in predicting all PE and term PE. In the prediction of all preeclampsia, the contribution of PLGF was higher than that of PAPP-A (11.9% vs. 8.7%), whereas the opposite held in the prediction of preterm preeclampsia (7.2% vs. 16.5%). The performance of the machine learning algorithms for preeclampsia and preterm preeclampsia was similar to that achieved by the Fetal Medicine Foundation competing risk model under the same predictive factors (AUCs of 0.797 and 0.856 for PE and preterm PE, respectively).
Conclusions: Our models provide an accessible tool for large-scale population screening and prediction of preeclampsia, which helps reduce the disease burden and improve maternal and fetal outcomes.
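The two metrics reported in the Results, AUC and detection rate at a 10% false-positive rate, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `detection_rate_at_fpr`, the tie handling, and the toy scores are assumptions made only for illustration. Here the detection rate is taken as the share of affected pregnancies whose risk score exceeds a cutoff chosen so that at most 10% of unaffected pregnancies are flagged.

```python
import math


def auc(labels, scores):
    """AUC as the probability that a case outranks a non-case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def detection_rate_at_fpr(labels, scores, max_fpr=0.10):
    """Share of cases flagged when at most `max_fpr` of non-cases are flagged."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = sorted((s for y, s in zip(labels, scores) if y == 0), reverse=True)
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # Number of non-cases we may flag; the epsilon guards against float rounding.
    k = math.floor(max_fpr * len(neg) + 1e-9)
    # Flag scores strictly above the (k+1)-th highest non-case score,
    # so no more than k non-cases are ever flagged (fewer if scores tie).
    threshold = neg[k] if k < len(neg) else float("-inf")
    return sum(s > threshold for s in pos) / len(pos)


# Toy data (hypothetical): 10 unaffected and 4 affected pregnancies.
toy_labels = [0] * 10 + [1] * 4
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95,
              0.92, 0.85, 0.5, 0.97]

print(auc(toy_labels, toy_scores))
print(detection_rate_at_fpr(toy_labels, toy_scores, max_fpr=0.10))
```

On the toy data, one of the ten unaffected pregnancies may be flagged, which sets the cutoff at 0.9 and detects two of the four affected pregnancies. In a cross-validated setting these functions would be applied to the held-out predictions of each fold.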
endocrinology & metabolism