Abstract: Purpose: To fine-tune and evaluate the performance of the retinal foundation model (RETFound) on a diverse longitudinal clinical research dataset in glaucoma detection from optical coherence tomography (OCT) RNFL scans. Subanalyses evaluated model performance across different subgroups, dataset sample sizes, and training cycles (epochs). Design: Evaluation of a diagnostic technology. Subjects, Participants, and Controls: 15,216 Spectralis OCT RNFL circle scans of 747 individuals of diverse race (56.9% White, 37.8% Black / African American, and 5.3% Other / Not reported), glaucoma severity (30.8% mild, 18.4% moderate-to-severe, and 50.9% no glaucoma), and age (44.8% <60 years, 55.2% >60 years) from the Diagnostic Innovations in Glaucoma Study (DIGS) and the African Descent and Glaucoma Evaluation Study (ADAGES). All OCT B-scans were labeled as "Non-glaucomatous" or "Glaucomatous." Methods: RETFound was employed to perform binary glaucoma classification. The diagnostic accuracy of RETFound was iteratively tested across different combinations of dataset sample sizes (50 to 2,000 OCT RNFL circle scans), epochs (5 to 50), and study subpopulations stratified by severity of glaucoma, age, and race. Main Outcome Measures: Area under the receiver operating characteristic curve (AUC) for classifying RNFL scans as "Non-glaucomatous" or "Glaucomatous." Results: Performance metrics improved with larger training datasets and more training cycles, rising from an AUC of 0.61 (50 training images and 5 epochs) to an AUC of 0.91 (2,000 training images and 50 epochs). Gains in performance were marginal as training size increased beyond 500 scans. Performance was similar across race for all training size and cycle number combinations: African American (AUC=0.90) vs other (AUC=0.93).
RNFL scans from older patients (>60 years) led to worse performance (AUC=0.85) compared to younger patients (<60 years, AUC=0.95). Performance was significantly higher for RNFL scans from patients with moderate-to-severe glaucoma vs mild glaucoma (AUC=0.99 vs 0.88, respectively). Conclusions: Good RETFound performance was observed with a relatively small sample size of images used for fine-tuning and across differences in race and age. The ability of RETFound to adapt across a range of OCT training conditions and populations suggests it is a promising tool to automate glaucoma detection in a variety of use cases.
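
The subgroup comparison described in the Results can be illustrated with a minimal sketch. It is not the study's code: the record fields (`age_group`, `label`, `score`), the helper names, and the toy values are all hypothetical, and the AUC is computed at the scan level with the Mann-Whitney pairwise formulation, which may differ from the estimator the authors used.

```python
from collections import defaultdict


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pairwise statistic.

    labels: 0 for "Non-glaucomatous", 1 for "Glaucomatous"
    scores: model output for the "Glaucomatous" class (higher = more likely)
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one scan from each class")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(records, key):
    """Group scan-level records by `key` and compute one AUC per group."""
    groups = defaultdict(lambda: ([], []))
    for record in records:
        labels, scores = groups[record[key]]
        labels.append(record["label"])
        scores.append(record["score"])
    return {name: auc(labels, scores) for name, (labels, scores) in groups.items()}


# Made-up toy records, for illustration only (not data from the study).
records = [
    {"age_group": "<60", "label": 1, "score": 0.9},
    {"age_group": "<60", "label": 0, "score": 0.2},
    {"age_group": "<60", "label": 1, "score": 0.7},
    {"age_group": "<60", "label": 0, "score": 0.4},
    {"age_group": ">60", "label": 1, "score": 0.6},
    {"age_group": ">60", "label": 0, "score": 0.7},
    {"age_group": ">60", "label": 1, "score": 0.8},
    {"age_group": ">60", "label": 0, "score": 0.3},
]

result = auc_by_subgroup(records, "age_group")
print(result)
```

The same helper could be called with a race or severity field as `key`. Because the dataset contains many scans per individual, a real analysis would also need to account for within-patient correlation when comparing subgroups, which this sketch does not attempt.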
