Constructing small for gestational age prediction models: A retrospective machine learning study

Xinyu Chen, Siqing Wu, Xinqing Chen, Linmin Hu, Wenjing Li, Ningning Mi, Peng Xie, Yujun Huang, Kun Yuan, Yajuan Sui, Renjie Li, Kangting Wang, Nan Sun, Yuyang Yao, Zuofeng Xu, Jinqiu Yuan, Yunxiao Zhu
DOI: https://doi.org/10.1016/j.ejogrb.2024.11.022
IF: 2.831
2024-11-28
European Journal of Obstetrics & Gynecology and Reproductive Biology
Abstract: Objective To develop machine learning prediction models for small for gestational age using baseline characteristics and biochemical tests from various pregnancy stages, individually and collectively, and to compare their predictive performance.
Study design This retrospective study included singleton pregnancies with infants born between May 2018 and March 2023. Small for gestational age was defined as a birth weight below the 10th percentile according to the Intergrowth-21st fetal growth standards. The pregnancy data were categorized into four datasets at different gestational time points (14 and 28 weeks and admission). The LightGBM framework was used to assess variable importance with five-fold cross-validation. RandomizedSearchCV and sequential feature selection were applied to estimate the optimal number of features. Seven machine learning algorithms were used to develop prediction models, with an 8:2 split between training and testing data. Model performance was evaluated using receiver operating characteristic curve analysis and sensitivity at a false positive rate of 10%.
Results We included data from 4,394 women with singleton pregnancies, including 148 (3.4%) small for gestational age infants. Women delivering small for gestational age infants had significantly shorter stature and lower fundal height and abdominal circumference at admission. Maternal height, age, and pre-pregnancy weight consistently ranked among the top 20 features in prediction models built on any dataset. Models incorporating admission-stage variables showed strong predictive performance, with areas under the curve exceeding 0.8. The prediction model developed with admission-stage variables performed best, achieving an area under the curve of 0.85 and a sensitivity of 73% at a false positive rate of 10%.
Conclusions Machine learning prediction models for small for gestational age built at various pregnancy stages showed good predictive performance, and the predictive value of variables at each pregnancy stage was fully explored. The best-performing model was built with admission-stage variables, which underscores the importance of prenatal physical examinations.
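
The two evaluation metrics named in the abstract, area under the receiver operating characteristic curve and sensitivity at a false positive rate of 10%, can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names (`roc_points`, `auc`, `sensitivity_at_fpr`) are hypothetical, the labels and scores are made-up toy values rather than study data, and the sensitivity is read off the empirical curve without interpolation, which is only one of several possible conventions.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(y_true: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Return the empirical ROC curve as (FPR, TPR) points from (0, 0) to (1, 1)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    # Rank cases from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Cases with tied scores enter together as a single threshold step.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points: Sequence[Point]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def sensitivity_at_fpr(points: Sequence[Point], max_fpr: float = 0.10) -> float:
    """Highest sensitivity among thresholds whose FPR does not exceed max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Toy example (illustrative values only): 3 positive cases and 10 negative cases.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1, 0.08, 0.05, 0.03, 0.01]

points = roc_points(y_true, scores)
toy_auc = auc(points)
toy_sensitivity = sensitivity_at_fpr(points, max_fpr=0.10)
print(f"AUC = {toy_auc:.3f}, sensitivity at 10% FPR = {toy_sensitivity:.3f}")
```

Fixing the false positive rate makes models comparable at the same screening burden: each model is judged by how many small for gestational age cases it detects when 10% of unaffected pregnancies are flagged.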
obstetrics & gynecology,reproductive biology