Supporting first FSH dosage for ovarian stimulation with Machine Learning

N. Correa, J. Cerquides, J.L. Arcos, R. Vassena
DOI: https://doi.org/10.1101/2022.02.10.22270790
2022-02-15
Abstract:
Research question: Is it possible to accurately identify the optimal first dose of FSH in controlled ovarian stimulation (COS) by means of a machine learning (ML) model?
Design: Observational study (2011 to 2021) including first In Vitro Fertilization (IVF) cycles with own oocytes. 2713 patients from five private reproductive centers were included in the development phase of the model (2011 to 2019), and 774 in the validation phase (2020 to 2021). Predictor variables were age, Body Mass Index (BMI), Anti-Müllerian Hormone (AMH), Antral Follicle Count (AFC), and previous live births. Model performance was measured with a proposed score based on the number of MII oocytes retrieved and the dose received and/or recommended.
Results: The cycles included were from women aged 37.2±4.9 years [range 18-45], with a BMI of 23.7±4.2, AMH of 2.4±2.3, and AFC of 11.8±7.7; the average number of MII oocytes obtained was 7.2±5.3. The model reached a mean performance score of 0.87 (95% CI 0.86 to 0.88) in the development phase, significantly better than the score for the doses prescribed by clinicians for the same patients (0.83, 95% CI 0.82 to 0.84; p-value = 2.44e-10). In the validation phase, the mean performance score of the model recommendations was 0.89 (95% CI 0.88 to 0.90), also significantly better than that of clinicians (0.84, 95% CI 0.82 to 0.86; p-value = 3.81e-05). These results show that the model outperformed standard practice.
Conclusion(s): The ML model developed could be deployed as a training and learning tool for new clinicians and serve as quality control for experienced ones. It could also be used as a second opinion, for instance by providing information in peer-to-peer case discussions.
Key Message: A Machine Learning model was trained to recommend first FSH doses for ovarian stimulation. Compared with clinicians, the model had consistently better performance scores. The model could be used as a second opinion and as a learning tool for new clinicians, to avoid as many non-optimal outcomes as possible.