Testing the generalizability and effectiveness of deep learning models among clinics: sperm detection as a pilot study

Jiaqi Wang, Yufei Jin, Aojun Jiang, Wenyuan Chen, Guanqiao Shan, Yifan Gu, Yue Ming, Jichang Li, Chunfeng Yue, Zongjie Huang, Clifford Librach, Ge Lin, Xibu Wang, Huan Zhao, Yu Sun, Zhuoran Zhang
DOI: https://doi.org/10.1186/s12958-024-01232-8
2024-05-24
Reproductive Biology and Endocrinology
Abstract: Deep learning has been increasingly investigated for assisting clinical in vitro fertilization (IVF). The first technical step in many tasks is to visually detect and locate sperm, oocytes, and embryos in images. For clinical deployment of such deep learning models, different clinics use different image acquisition hardware and different sample preprocessing protocols, raising the concern over whether the reported accuracy of a deep learning model by one clinic could be reproduced in another clinic. Here we aim to investigate the effect of each imaging factor on the generalizability of object detection models, using sperm analysis as a pilot example.
endocrinology & metabolism, reproductive biology
What problem does this paper attempt to address?