Deep Conformal Supervision: Leveraging Intermediate Features for Robust Uncertainty Quantification

Amir M Vahdani, Shahriar Faghani
DOI: https://doi.org/10.1007/s10278-024-01286-5
2024-10-07
Abstract: Trustworthiness is crucial for artificial intelligence (AI) models in clinical settings, and uncertainty quantification (UQ) is a fundamental aspect of trustworthy AI. Conformal prediction, a robust UQ framework, has been receiving increasing attention as a tool for improving model trustworthiness. An area of active research is how to calculate the non-conformity score for conformal prediction. We propose deep conformal supervision (DCS), which leverages the intermediate outputs of deep supervision for non-conformity score calculation via weighted averaging, with each stage weighted by the inverse of its mean calibration error. We benchmarked our method on two publicly available medical image classification datasets: a pneumonia chest radiography dataset and a preprocessed version of the 2019 RSNA Intracranial Hemorrhage dataset. Our method achieved mean coverage errors of 16e-4 (CI: 1e-4, 41e-4) and 5e-4 (CI: 1e-4, 10e-4), compared to baseline mean coverage errors of 28e-4 (CI: 2e-4, 64e-4) and 21e-4 (CI: 8e-4, 3e-4) on the two datasets, respectively (p < 0.001 on both datasets). Although baseline conformal prediction already exhibits small coverage errors, our method significantly improves coverage error, most noticeably on smaller datasets or at smaller acceptable error levels, conditions that are crucial in developing UQ frameworks for healthcare AI applications.
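The weighting scheme described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the per-stage calibration error is computed or whether the averaging is applied to probabilities or to scores, so the following makes explicit assumptions: each stage's error is taken to be the mean of 1 - p(true class) on the calibration set, the weighted average is taken over stage softmax outputs, and the non-conformity score is 1 - p(true class) of the combined output. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def stage_weights(stage_probs, labels, eps=1e-12):
    """Inverse-error weights per stage (assumed definition of 'calibration error').

    stage_probs: array of shape (S, N, C), softmax outputs of S supervised stages
    labels:      integer array of shape (N,), true classes on the calibration set
    """
    n = labels.shape[0]
    p_true = stage_probs[:, np.arange(n), labels]   # (S, N)
    mean_err = (1.0 - p_true).mean(axis=1)          # (S,) assumed error measure
    inv = 1.0 / (mean_err + eps)
    return inv / inv.sum()                          # normalize to sum to 1

def combine(stage_probs, weights):
    """Weighted average of stage outputs, shape (N, C)."""
    return np.tensordot(weights, stage_probs, axes=1)

def conformal_quantile(scores, alpha):
    """Standard split-conformal quantile with finite-sample correction."""
    n = scores.shape[0]
    level = min(1.0, np.ceil((n + 1) * (1.0 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def prediction_sets(probs, qhat):
    """Boolean mask (N, C): class is in the set if its score is within qhat."""
    return (1.0 - probs) <= qhat
```

In use, one would compute the weights and the quantile on a held-out calibration split, then apply `combine` and `prediction_sets` to test-time stage outputs with the same weights. The `method="higher"` argument requires NumPy 1.22 or later.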