Development of a deep learning-based image eligibility verification system for detecting and filtering out ineligible fundus images: A multicentre study

Zhongwen Li, Jiewei Jiang, Heding Zhou, Qinxiang Zheng, Xiaotian Liu, Kuan Chen, Hongfei Weng, Wei Chen
DOI: https://doi.org/10.1016/j.ijmedinf.2020.104363
IF: 4.73
2021-03-01
International Journal of Medical Informatics
Abstract:

Background: Recent advances in artificial intelligence (AI) have shown great promise in detecting some diseases based on medical images. Most studies developed AI diagnostic systems using only eligible images. However, in real-world settings, ineligible images (including poor-quality and poor-location images) that can compromise downstream analysis are inevitable, leading to uncertainty about the performance of these AI systems. This study aims to develop a deep learning-based image eligibility verification system (DLIEVS) for detecting and filtering out ineligible fundus images.

Methods: A total of 18,031 fundus images (9,188 subjects) collected from 4 clinical centres were used to develop and evaluate the DLIEVS for detecting eligible, poor-location, and poor-quality fundus images. Four deep learning algorithms (AlexNet, DenseNet121, Inception V3, and ResNet50) were each used to train a model, and the best-performing model was selected for the DLIEVS. The performance of the DLIEVS was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity, as compared with a reference standard determined by retina experts.

Results: In the internal test dataset, the best algorithm (DenseNet121) achieved AUCs of 1.000, 0.999, and 1.000 for the classification of eligible, poor-location, and poor-quality images, respectively. In the external test datasets, the AUCs of the best algorithm (DenseNet121) for detecting eligible, poor-location, and poor-quality images ranged from 0.999 to 1.000, 0.997 to 1.000, and 0.997 to 0.999, respectively.

Conclusions: Our DLIEVS can accurately discriminate poor-quality and poor-location images from eligible images. This system has the potential to serve as a pre-screening technique to filter out ineligible images obtained from real-world settings, ensuring that only eligible images are used in subsequent image-based AI diagnostic analyses.
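
The evaluation and pre-screening steps described in the abstract can be illustrated with a minimal sketch. It assumes a three-class classifier that outputs one probability per class for each image, and it computes one-vs-rest AUC (via the Mann-Whitney statistic), sensitivity, and specificity against reference labels, then keeps only images predicted as eligible. All function names, image identifiers, labels, and probabilities below are hypothetical toy values for illustration; they are not the study's code or data.

```python
# Illustrative sketch only: every name and number here is hypothetical and
# is not taken from the study.

CLASSES = ["eligible", "poor-location", "poor-quality"]


def one_vs_rest_auc(y_true, scores, positive):
    """AUC for one class against the rest, via the Mann-Whitney statistic."""
    pos = [s for y, s in zip(y_true, scores) if y == positive]
    neg = [s for y, s in zip(y_true, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, y_pred, positive):
    """One-vs-rest sensitivity and specificity for the given class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def filter_eligible(image_ids, y_pred):
    """Pre-screening step: keep only images predicted as eligible."""
    return [i for i, p in zip(image_ids, y_pred) if p == "eligible"]


# Toy reference labels (standing in for the expert reference standard).
image_ids = ["a", "b", "c", "d", "e", "f"]
y_true = ["eligible", "eligible", "poor-location",
          "poor-quality", "eligible", "poor-quality"]

# Toy per-class probabilities, as a softmax classifier might output.
probs = [
    {"eligible": 0.90, "poor-location": 0.05, "poor-quality": 0.05},
    {"eligible": 0.60, "poor-location": 0.30, "poor-quality": 0.10},
    {"eligible": 0.20, "poor-location": 0.70, "poor-quality": 0.10},
    {"eligible": 0.10, "poor-location": 0.10, "poor-quality": 0.80},
    {"eligible": 0.40, "poor-location": 0.50, "poor-quality": 0.10},
    {"eligible": 0.30, "poor-location": 0.10, "poor-quality": 0.60},
]

# Predicted class = class with the highest probability.
y_pred = [max(p, key=p.get) for p in probs]

for cls in CLASSES:
    auc = one_vs_rest_auc(y_true, [p[cls] for p in probs], cls)
    sens, spec = sensitivity_specificity(y_true, y_pred, cls)
    print(f"{cls}: AUC={auc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")

kept = filter_eligible(image_ids, y_pred)
```

In this toy data, image "e" is eligible but is predicted as poor-location, so the filter drops it. This shows why sensitivity for the eligible class matters in a pre-screening setting: a usable image that is misclassified never reaches the downstream diagnostic model.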
health care sciences & services; computer science, information systems; medical informatics
What problem does this paper attempt to address?