Conformal Prediction for Deep Classifier via Label Ranking

Jianguo Huang, Huajun Xi, Linjun Zhang, Huaxiu Yao, Yue Qiu, Hongxin Wei
2024-06-06
Abstract: Conformal prediction is a statistical framework that generates prediction sets containing ground-truth labels with a desired coverage guarantee. The predicted probabilities produced by machine learning models are generally miscalibrated, leading to large prediction sets in conformal prediction. To address this issue, we propose a novel algorithm named $\textit{Sorted Adaptive Prediction Sets}$ (SAPS), which discards all the probability values except for the maximum softmax probability. The key idea behind SAPS is to minimize the dependence of the non-conformity score on the probability values while retaining the uncertainty information. In this manner, SAPS can produce compact prediction sets and communicate instance-wise uncertainty. Extensive experiments validate that SAPS not only lessens the prediction sets but also broadly enhances the conditional coverage rate of prediction sets.
Machine Learning, Computer Vision and Pattern Recognition, Statistics Theory
What problem does this paper attempt to address?
### The problems the paper attempts to solve

This paper addresses how to generate more compact prediction sets with high conditional coverage when using deep classifiers. Specifically, it focuses on the fact that, in the Conformal Prediction (CP) framework, the predicted probabilities produced by machine learning models are usually miscalibrated, which leads to overly large prediction sets. To alleviate this problem, the authors propose a new algorithm, Sorted Adaptive Prediction Sets (SAPS).

### Main contributions

1. **Redundancy of probability values**: Through empirical analysis, the authors find that the probability values may be unnecessary information in Adaptive Prediction Sets (APS). After the probability values are removed, APS generates smaller prediction sets, and the size of the prediction sets is consistent with the prediction accuracy of the model.
2. **Proposing a new non-conformity score**: The SAPS algorithm minimizes the dependence of the non-conformity score on probability values while retaining uncertainty information. SAPS keeps only the maximum softmax probability; the remaining probability values are replaced with constants.
3. **Experimental verification**: Extensive experiments on multiple benchmark datasets (CIFAR-10, CIFAR-100, and ImageNet) verify the effectiveness of SAPS. The results show that SAPS not only reduces the average size of prediction sets but also broadly improves conditional coverage.
4. **Theoretical analysis**: A theoretical analysis further explains why APS improves once the probability values are removed. In particular, the authors prove that APS without probability values generates prediction sets whose size is consistent with the prediction accuracy of the model.

### Method overview

1. **Motivation analysis**: The authors conduct an ablation study that removes the influence of the probability values. The prediction sets generated by APS based on label ranking alone are smaller than those of standard APS, indicating that the probability values may be redundant in the non-conformity score.
2. **SAPS algorithm**: The core idea of SAPS is to minimize the dependence of the non-conformity score on probability values while retaining uncertainty information. Specifically, SAPS keeps only the maximum softmax probability and replaces the remaining probability values with constants. This yields more compact prediction sets that better reflect the uncertainty of individual instances.
3. **Experimental setup**: Experiments are carried out on ImageNet, CIFAR-10, and CIFAR-100 with multiple pre-trained models, which are calibrated through Temperature Scaling. The results show that SAPS performs well across the datasets and models tested.

### Conclusion

By removing the influence of the probability values, the SAPS algorithm generates more compact prediction sets with high conditional coverage. Both the experimental results and the theoretical analysis support this conclusion.
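
The SAPS idea described in the method overview (keep the maximum softmax probability, replace every other probability with a constant) can be illustrated with a minimal sketch. The summary does not state the exact score, so the form used here is an assumption: a label of rank 1 gets `u * p_max`, and a label of rank `r > 1` gets `p_max + (r - 2 + u) * lam`, where `lam` is a hypothetical constant weight and `u` is a uniform random draw used for randomization. The function names, the split-conformal calibration step, and the parameter `alpha` (target miscoverage rate) are likewise illustrative choices, not the authors' implementation.

```python
import numpy as np


def saps_scores(probs, lam, u):
    """Rank-based non-conformity scores for every label (illustrative form).

    probs: (n, K) softmax outputs.
    lam:   hypothetical constant that replaces the non-maximum probabilities.
    u:     (n,) draws from Uniform(0, 1); the randomization is an assumption.
    """
    probs = np.asarray(probs, dtype=float)
    u = np.asarray(u, dtype=float)[:, None]
    n, k = probs.shape
    # ranks[i, j] = 1 if label j is the most probable for sample i, ..., K if least.
    order = np.argsort(-probs, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(1, k + 1)
    p_max = probs.max(axis=1, keepdims=True)
    # Only the maximum probability enters the score; each further rank adds lam.
    return np.where(ranks == 1, u * p_max, p_max + (ranks - 2 + u) * lam)


def calibrate_threshold(cal_probs, cal_labels, lam, alpha, rng):
    """Split-conformal threshold from a held-out calibration set."""
    cal_labels = np.asarray(cal_labels)
    n = len(cal_labels)
    scores = saps_scores(cal_probs, lam, rng.uniform(size=n))
    true_scores = np.sort(scores[np.arange(n), cal_labels])
    # Finite-sample corrected quantile index: ceil((n + 1) * (1 - alpha)).
    idx = int(np.ceil((n + 1) * (1 - alpha)))
    return np.inf if idx > n else true_scores[idx - 1]


def prediction_sets(test_probs, tau, lam, rng):
    """Return, per test sample, the labels whose score does not exceed tau."""
    scores = saps_scores(test_probs, lam, rng.uniform(size=len(test_probs)))
    return [np.flatnonzero(row <= tau) for row in scores]
```

In use, one would compute `tau = calibrate_threshold(cal_probs, cal_labels, lam, alpha, np.random.default_rng(0))` on held-out data and then call `prediction_sets(test_probs, tau, lam, rng)`. Because every label beyond the top one contributes the same increment `lam`, the set size in this sketch depends on the probabilities only through the top-1 confidence and the label ranking.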