Computer Vision Identification of Trachomatous Inflammation-Follicular Using Deep Learning

Ashlin S Joye, Marissa G Firlie, Dionna M Wittberg, Solomon Aragie, Scott D Nash, Zerihun Tadesse, Adane Dagnew, Dagnachew Hailu, Fisseha Admassu, Bilen Wondimteka, Habib Getachew, Endale Kabtu, Social Beyecha, Meskerem Shibiru, Banchalem Getnet, Tibebe Birhanu, Seid Abdu, Solomon Tekew, Thomas M Lietman, Jeremy D Keenan, Travis K Redd
DOI: https://doi.org/10.1097/ICO.0000000000003701
IF: 3.152
2024-09-20
Cornea
Abstract:
Purpose: Trachoma surveys are used to estimate the prevalence of trachomatous inflammation-follicular (TF) to guide mass antibiotic distribution. These surveys currently rely on human graders, which imposes a significant resource burden and introduces potential for human error. This study describes the development and evaluation of machine learning models intended to reduce the cost and improve the reliability of these surveys.
Methods: A total of 56,725 everted eyelid photographs were obtained from 11,358 children aged 0 to 9 years in a single trachoma-endemic region of Ethiopia over a 3-year period. Expert graders reviewed all images from each examination to determine the estimated number of tarsal conjunctival follicles and the degree of trachomatous inflammation-intense. The median estimate of the 3 grader groups was used as the ground truth to train a MobileNetV3 large deep convolutional neural network to detect cases with TF.
Results: The classification model predicted a TF prevalence of 32%, which was not significantly different from the human consensus estimate (30%; 95% confidence interval of difference, -2% to +4%). The model had an area under the receiver operating characteristic curve of 0.943, F1 score of 0.923, 88% accuracy, 83% sensitivity, and 91% specificity. The area under the receiver operating characteristic curve increased to 0.995 when interpreting nonborderline cases of TF.
Conclusions: Deep convolutional neural network models performed well at classifying TF and detecting the number of follicles evident in conjunctival photographs. Implementation of similar models may enable accurate, efficient, large-scale trachoma screening. Further validation in diverse populations with varying TF prevalence is needed before implementation at scale.
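The labeling and evaluation steps described in the Methods and Results can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the toy data, and the threshold of 5 follicles (taken from the WHO simplified grading definition of TF) are assumptions made for illustration; the abstract does not state the cutoff the study applied to the median follicle estimate.

```python
from statistics import median

# Assumption: TF is called when the consensus follicle count is >= 5,
# following the WHO simplified grading system. The abstract does not
# state the exact cutoff used in the study.
TF_FOLLICLE_THRESHOLD = 5


def consensus_label(follicle_estimates):
    """Return True (TF present) if the median grader estimate meets the threshold."""
    return median(follicle_estimates) >= TF_FOLLICLE_THRESHOLD


def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, accuracy, and F1 from boolean labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    tn = sum((not t) and (not p) for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(y_true) if y_true else nan,
        "f1": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else nan,
        "predicted_prevalence": sum(y_pred) / len(y_pred) if y_pred else nan,
    }


# Toy data (hypothetical): one tuple of three grader-group estimates per child.
grader_estimates = [(2, 3, 7), (5, 6, 9), (0, 1, 1), (4, 5, 5)]
ground_truth = [consensus_label(e) for e in grader_estimates]

# Hypothetical model outputs for the same four examinations.
model_predictions = [False, True, True, True]

metrics = binary_metrics(ground_truth, model_predictions)
print(ground_truth)
print(metrics)
```

Using the median rather than the mean keeps a single outlying grader from flipping the consensus label, which is one plausible reason to prefer it when three independent estimates are available.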