Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration

Shi-ang Qi, Yakun Yu, Russell Greiner
2024-06-03
Abstract: Discrimination and calibration represent two important properties of survival analysis, with the former assessing the model's ability to accurately rank subjects and the latter evaluating the alignment of predicted outcomes with actual events. With their distinct nature, it is hard for survival models to simultaneously optimize both of them, especially as many previous results found improving calibration tends to diminish discrimination performance. This paper introduces a novel approach utilizing conformal regression that can improve a model's calibration without degrading discrimination. We provide theoretical guarantees for the above claim, and rigorously validate the efficiency of our approach across 11 real-world datasets, showcasing its practical applicability and robustness in diverse scenarios.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the difficulty of optimizing a survival model's discrimination and calibration at the same time. Specifically:

1. **Trade-off between discrimination and calibration**: Many existing survival analysis models sacrifice calibration (the agreement between the predicted probability distribution and the observed outcomes) when they optimize discrimination (the ability to rank subjects, such as patients, by risk). This trade-off limits how useful a model is in practice, especially in medical decision-making, where reliable probabilities matter.
2. **Limitations of existing methods**: To improve calibration, some studies add a calibration-related loss term during training. This makes training harder: convergence becomes more difficult and more hyperparameters need tuning. More importantly, these methods often reduce discrimination, which is unacceptable in applications where ranking accuracy is equally important.

To address these problems, the paper proposes the **Conformalized Survival Distribution (CSD)** framework, a plug-in post-processing method based on conformal regression that aims to improve the calibration of a survival model without compromising its discrimination. The main contributions are:

- **Adaptability**: CSD is model-agnostic and can be applied to any statistical or machine-learning survival model that predicts individual survival distributions (ISDs), that is, a survival curve for each subject.
- **Theoretical guarantee**: The paper provides theoretical guarantees that CSD improves calibration while preserving discrimination.
- **Empirical verification**: The effectiveness and robustness of CSD are evaluated through extensive experiments on 11 real-world datasets.
- **Connection between calibration metrics**: The paper shows that minimizing distribution calibration (D-cal) is asymptotically equivalent to minimizing integrated single-time-point calibration (1-cal), which links the two main calibration metrics.

In conclusion, the CSD framework offers a way to improve the calibration of survival analysis models without sacrificing discrimination, making their predictions more reliable in practical applications.
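To make the idea of a calibration-improving post-process concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes fully uncensored data, omits the usual finite-sample correction used in conformal prediction, and the function names `d_calibration_histogram` and `conformal_shift` are hypothetical. It illustrates two points from the summary. First, for a distribution-calibrated model, the predicted survival probabilities evaluated at each subject's own event time should be roughly uniform on [0, 1]. Second, shifting every subject's predicted quantile at a given level by the same constant leaves the ranking of subjects at that level unchanged.

```python
import numpy as np


def d_calibration_histogram(surv_at_event, n_bins=10):
    """Proportion of subjects per equal-width bin of S_i(t_i) on [0, 1].

    surv_at_event holds each subject's predicted survival probability at
    their own observed event time (uncensored subjects only, by assumption).
    A distribution-calibrated model gives roughly 1 / n_bins in every bin.
    """
    counts, _ = np.histogram(surv_at_event, bins=n_bins, range=(0.0, 1.0))
    return counts / counts.sum()


def conformal_shift(cal_pred_q, cal_times, test_pred_q, levels):
    """Shift predicted quantile times using residuals from a calibration set.

    cal_pred_q  : (n_cal, n_levels) predicted times q_p(x_i), where the model
                  claims P(T <= q_p(x)) = p for each p in `levels`.
    cal_times   : (n_cal,) observed event times (assumed uncensored).
    test_pred_q : (n_test, n_levels) predictions to adjust.
    levels      : (n_levels,) probability levels p in (0, 1).

    For each level p, the shift is the empirical p-quantile of the residuals
    t_i - q_p(x_i), so that about a fraction p of calibration subjects have
    their event before the adjusted quantile.
    """
    residuals = cal_times[:, None] - cal_pred_q  # shape (n_cal, n_levels)
    shifts = np.array(
        [np.quantile(residuals[:, j], p) for j, p in enumerate(levels)]
    )
    # The same shift is added for every subject at a given level, so the
    # ordering of subjects at that level is unchanged.
    return test_pred_q + shifts
```

Because the shift is estimated independently at each level, this simplified version does not guarantee that an individual's adjusted quantiles stay monotone across levels, and it ignores censoring entirely; both would need to be handled in any real use.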