Advancing Glaucoma Diagnosis: Employing Confidence-Calibrated Label Smoothing Loss for Model Calibration

Midhula Vijayan, Deepthi Keshav Prasad, Venkatakrishnan Srinivasan
DOI: https://doi.org/10.1016/j.xops.2024.100555
2024-06-22
Abstract: Objective: To improve the calibration of machine learning models for glaucoma classification using a specialized loss function, the Confidence-Calibrated Label Smoothing (CC-LS) loss. The approach combines label smoothing with a confidence penalty, tailored to glaucoma detection, so that calibration improves without compromising accuracy. Design: Development and evaluation of a calibrated deep learning model. Participants: The study uses fundus images from two external datasets, the Online Retinal Fundus Image Database for Glaucoma Analysis and Research (482 normal, 168 glaucoma) and the Retinal Fundus Glaucoma Challenge (720 normal, 80 glaucoma), together with a large internal dataset (4639 images per category), to improve the model's generalizability. Clinical performance is validated on a test set drawn from the internal dataset (47 913 normal, 1629 glaucoma). Methods: The CC-LS loss function integrates label smoothing, which tempers extreme predictions to avoid overfitting, with confidence-based penalties that deter the model from expressing undue confidence in incorrect classifications. Models trained with the CC-LS loss are compared with models trained using conventional loss functions. Main outcome measures: Model performance is evaluated using the Brier score, sensitivity, specificity, and the false positive rate, alongside qualitative heatmap analyses. Results: Preliminary findings show that models trained with the CC-LS loss exhibit superior calibration, as evidenced by a Brier score of 0.098, with sensitivity of 81%, specificity of 80%, and weighted accuracy of 80%. Importantly, these improvements in calibration are achieved without sacrificing classification accuracy. Conclusions: The CC-LS loss function is a significant step toward deploying machine learning models for glaucoma diagnosis. By improving calibration, the CC-LS loss helps clinicians interpret and trust the predicted probabilities, making artificial intelligence-driven diagnostic tools more clinically viable. From a clinical standpoint, this greater trust and interpretability may lead to more timely and appropriate interventions, improving patient outcomes and safety. Financial disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
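The abstract does not give the exact form of the CC-LS loss, so the following is only a rough sketch of the general idea it describes: cross-entropy against label-smoothed targets combined with a confidence penalty. The entropy-based penalty, the function names, and the hyperparameters `epsilon` and `beta` are all assumptions made for illustration and should not be read as the authors' formulation. The block also shows how the Brier score, sensitivity, and specificity named in the outcome measures are commonly computed for a binary classifier.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def cc_ls_loss_sketch(logits, labels, epsilon=0.1, beta=0.1):
    """Hypothetical stand-in for a CC-LS-style loss (NOT the paper's formula).

    Assumed form: cross-entropy against smoothed targets, minus beta times
    the entropy of the predicted distribution, so that low-entropy
    (over-confident) predictions are penalized.
    """
    _, k = logits.shape
    probs = softmax(logits)
    log_probs = np.log(probs + 1e-12)

    # Label smoothing: move epsilon of the target mass to a uniform prior.
    one_hot = np.eye(k)[labels]
    smoothed = (1.0 - epsilon) * one_hot + epsilon / k
    cross_entropy = -(smoothed * log_probs).sum(axis=1)

    # Confidence penalty (assumed): reward higher predictive entropy.
    entropy = -(probs * log_probs).sum(axis=1)

    return float((cross_entropy - beta * entropy).mean())


def brier_score(p_pos, labels):
    """Mean squared error between predicted P(glaucoma) and 0/1 labels."""
    return float(np.mean((p_pos - labels) ** 2))


def sensitivity_specificity(p_pos, labels, threshold=0.5):
    """Sensitivity and specificity at a fixed decision threshold."""
    pred = p_pos >= threshold
    pos = labels == 1
    tp = np.sum(pred & pos)
    fn = np.sum(~pred & pos)
    tn = np.sum(~pred & ~pos)
    fp = np.sum(pred & ~pos)
    return float(tp / (tp + fn)), float(tn / (tn + fp))


if __name__ == "__main__":
    # Toy data: 4 images, class 1 = glaucoma, class 0 = normal.
    logits = np.array([[0.2, 2.0], [1.5, -0.5], [0.3, 0.1], [-0.2, 0.4]])
    labels = np.array([1, 0, 1, 0])
    p_glaucoma = softmax(logits)[:, 1]

    print("loss sketch:", cc_ls_loss_sketch(logits, labels))
    print("Brier score:", brier_score(p_glaucoma, labels))
    print("sens, spec:", sensitivity_specificity(p_glaucoma, labels))
```

With `epsilon=0` and `beta=0` the sketch reduces to ordinary cross-entropy, which makes it easy to compare against a conventional loss on the same predictions. A lower Brier score indicates predicted probabilities that sit closer to the observed outcomes.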