Artificial Intelligence for Retinopathy of Prematurity
J. Peter Campbell, Michael F. Chiang, Jimmy S. Chen, Darius M. Moshfeghi, Eric Nudleman, Paisan Ruamviboonsuk, D. Hunter Cherwek, Carol Y. Cheung, Praveer Singh, Jayashree Kalpathy-Cramer, Susan Ostmo, Malvina Eydelman, R.V. Paul Chan, Antonio Capone, Audina Berrocal, Gil Binenbaum, Michael Blair, Yi Chen, Shuan Dai, Anna Ells, Alistair Fielder, Brian Fleck, William Good, Mary Elizabeth Hartnett, Gerd Holmstrom, Shunji Kusaka, Andres Kychenthal, Domenico Lepore, Birgit Lorenz, Maria Ana Martinez-Castellanos, Sengul Ozdek, Dupe Popoola, Graham Quinn, James Reynolds, Parag Shah, Michael Shapiro, Andreas Stahl, Cynthia Toth, Anand Vinekar, Linda Visser, David Wallace, Wei-Chi Wu, Peiquan Zhao, Andrea Zin, Michael Abramoff, Mark Blumenkranz, David Myung, Joel S. Schuman, Carol Shields, Aaron Lee, Michael Repka
DOI: https://doi.org/10.1016/j.ophtha.2022.02.008
IF: 14.277
2022-02-01
Ophthalmology
Abstract:
PURPOSE: To validate a vascular severity score as an appropriate output for artificial intelligence (AI) Software as a Medical Device (SaMD) for retinopathy of prematurity (ROP) through comparison with ordinal disease severity labels for stage and plus disease assigned by the International Classification of Retinopathy of Prematurity, Third Edition (ICROP3), committee.
DESIGN: Validation study of an AI-based ROP vascular severity score.
PARTICIPANTS: A total of 34 ROP experts from the ICROP3 committee.
METHODS: Two separate datasets of 30 fundus photographs each, one for stage (0-5) and one for plus disease (plus, preplus, neither), were labeled by members of the ICROP3 committee using an open-source platform. Averaging these results produced a continuous label for plus (1-9) and stage (1-3) for each image. Experts were also asked to compare each image with every other image in terms of relative severity for plus disease. Each image was also labeled with a vascular severity score from the Imaging and Informatics in ROP deep learning system, which was compared with each grader's diagnostic labels for correlation, as well as with the ophthalmoscopic diagnosis of stage.
MAIN OUTCOME MEASURES: Weighted kappa and Pearson correlation coefficients (CCs) were calculated between each pair of grader classification labels for stage and plus disease. The Elo algorithm was also used to convert pairwise comparisons for each expert into an ordered set of images from least to most severe.
RESULTS: The mean weighted kappa and CC for all interobserver pairs for plus disease image comparison were 0.67 and 0.88, respectively. The vascular severity score was found to be highly correlated with both the average plus disease classification (CC = 0.90, P < 0.001) and the ophthalmoscopic diagnosis of stage (P < 0.001 by analysis of variance) among all experts.
CONCLUSIONS: The ROP vascular severity score correlates well with the International Classification of Retinopathy of Prematurity committee members' labels for plus disease and stage, which had significant intergrader variability. Generation of a consensus for a validated scoring system for ROP SaMD can facilitate global innovation and regulatory authorization of these technologies.
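The Elo step named under MAIN OUTCOME MEASURES can be illustrated with a minimal Python sketch of how one expert's pairwise "which image is more severe" judgments might be turned into an ordering from least to most severe. The abstract does not give the parameters used in the study, so the function name `elo_rank`, the K-factor of 32, the starting rating of 1500, the 400-point scale, and the single pass over the comparisons are all assumptions chosen for illustration and are not the authors' implementation.

```python
def elo_rank(image_ids, comparisons, k=32.0, initial=1500.0, passes=1):
    """Order images from least to most severe using standard Elo updates.

    image_ids:   list of hashable image identifiers.
    comparisons: list of (more_severe, less_severe) pairs from one grader.
    k, initial, passes: hypothetical settings, not taken from the paper.
    """
    ratings = {img: initial for img in image_ids}
    for _ in range(passes):
        for more_severe, less_severe in comparisons:
            r_more = ratings[more_severe]
            r_less = ratings[less_severe]
            # Expected probability that `more_severe` is judged more severe,
            # given the current ratings (logistic curve, 400-point scale).
            expected = 1.0 / (1.0 + 10.0 ** ((r_less - r_more) / 400.0))
            # The image judged more severe "wins": observed outcome is 1.
            delta = k * (1.0 - expected)
            ratings[more_severe] = r_more + delta
            ratings[less_severe] = r_less - delta
    # A lower rating means the image was judged less severe.
    order = sorted(image_ids, key=lambda img: ratings[img])
    return order, ratings


# Toy example with three hypothetical images: C judged more severe than B,
# B more severe than A, and C more severe than A.
order, ratings = elo_rank(["A", "B", "C"], [("C", "B"), ("B", "A"), ("C", "A")])
print(order)
```

Because each update adds to one rating exactly what it subtracts from the other, the ratings always sum to the same total. The resulting order can depend on the sequence in which comparisons are processed, which is one reason an implementation might shuffle the comparisons or make several passes.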
ophthalmology