Adjusting Regression Models for Conditional Uncertainty Calibration

Ruijiang Gao, Mingzhang Yin, James McInerney, Nathan Kallus
2024-09-26
Abstract: Conformal Prediction methods have finite-sample distribution-free marginal coverage guarantees. However, they generally do not offer conditional coverage guarantees, which can be important for high-stakes decisions. In this paper, we propose a novel algorithm to train a regression function to improve the conditional coverage after applying the split conformal prediction procedure. We establish an upper bound for the miscoverage gap between the conditional coverage and the nominal coverage rate and propose an end-to-end algorithm to control this upper bound. We demonstrate the efficacy of our method empirically on synthetic and real-world datasets.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to improve conditional coverage in regression tasks by optimizing the prediction function itself, so that uncertainty estimates are better calibrated. Specifically, it proposes an algorithm that trains a regression function so that conditional coverage improves after the split conformal prediction procedure is applied. Traditional conformal prediction methods guarantee marginal coverage, but they generally do not guarantee conditional coverage. This matters in high-stakes decision-making, because marginal coverage does not ensure valid coverage for specific subgroups, particularly rare events or minority groups.

### Main contributions:

1. **A new method**: Optimize the prediction function by minimizing the Kolmogorov-Smirnov (KS) distance between the marginal non-conformity score distribution and the conditional non-conformity score distribution, thereby improving conditional coverage.
2. **Theoretical connection**: Establish a theoretical connection between the proposed KS regularization and the conditional coverage objective.
3. **Empirical verification**: Use synthetic data and real-world datasets to verify the effectiveness and advantages of the proposed method.

### Background and motivation:

- **Supervised machine learning**: The core task is to predict the target variable \(Y\) from the input vector \(\mathbf{X}\), usually by constructing a prediction function \(f(Y|\mathbf{X})\).
- **Conformal prediction**: Produces prediction sets that are calibrated to contain the unobserved target with a specified probability. Traditional methods guarantee only marginal coverage, not conditional coverage.
- **Importance of conditional coverage**: In high-stakes decision-making, conditional coverage matters more than marginal coverage because it ensures valid coverage for specific subgroups.
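
For reference, the two notions of coverage contrasted in this summary can be written out as follows. The notation is an assumption made here for illustration and is not taken from the summary: \(C(\mathbf{X})\) denotes the prediction set and \(1-\alpha\) the nominal coverage level.

\[
\text{Marginal coverage:}\quad P\big(Y \in C(\mathbf{X})\big) \ge 1-\alpha
\]

\[
\text{Conditional coverage:}\quad P\big(Y \in C(\mathbf{X}) \mid \mathbf{X}=\mathbf{x}\big) \ge 1-\alpha \quad \text{for (almost) every } \mathbf{x}
\]

The marginal statement averages over the distribution of \(\mathbf{X}\), so it can hold even when some regions of the input space are covered far less often than \(1-\alpha\), as long as other regions are over-covered.
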
### Method overview:

- **Non-conformity score function**: Define a non-conformity score function \(V(\mathbf{X}, Y)\) that measures the deviation between the predicted value and the true value.
- **Split conformal prediction framework**: Divide the dataset into a training set and a calibration set, train the prediction function on the training set, and compute the non-conformity scores on the calibration set.
- **Optimization objective**: Optimize the prediction function by minimizing the KS distance between the marginal non-conformity score distribution and the conditional non-conformity score distribution. The specific objective function is:

\[
\min_{\theta} \mathbb{E}[(y - f_{\theta}(\mathbf{x}))^2]+\lambda \sup_{\mathbf{x}} \text{KS}(P(V|\mathbf{X}=\mathbf{x}), P(V))
\]

where \(\lambda\) is a hyperparameter that trades off the mean squared error against the KS distance penalty.

### Experimental results:

- **Synthetic data**: Across different settings, the proposed method substantially improves conditional coverage, especially for some small subgroups.
- **Real data**: Evaluated on 6 UCI datasets; the results show that the proposed method improves the worst-subgroup-left-at-best (WSLAB) coverage across different non-conformity scores and datasets.

### Conclusion:

The proposed method improves conditional coverage by optimizing the prediction function, which is especially relevant in high-stakes decision-making. Experiments show that it performs better on both synthetic data and real-world datasets.
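
The split conformal procedure and the KS gap from the method overview can be illustrated with a small diagnostic sketch. It is not the authors' training algorithm: it fits a plain least-squares line, uses the absolute residual as the non-conformity score, and measures the KS distance on a single slab of the input space instead of taking a supremum over \(\mathbf{x}\). The synthetic data, the helper names `split_conformal_quantile` and `ks_distance`, and the slab thresholds are all assumptions made for illustration.

```python
import numpy as np


def split_conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))  # rank of the order statistic to use
    if k > n:
        return np.inf  # too few calibration points for this alpha
    return np.sort(scores)[k - 1]


def ks_distance(a, b):
    """Two-sample KS distance: largest gap between the two empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])  # the gap is maximized at an observed point
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))


rng = np.random.default_rng(0)


def sample(n):
    """Hypothetical heteroscedastic data: the noise scale grows with x."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = 2.0 * x + rng.normal(0.0, 0.1 + x, size=n)
    return x, y


x_train, y_train = sample(2000)
x_cal, y_cal = sample(2000)
x_test, y_test = sample(20000)

# Train the prediction function on the training split only.
slope, intercept = np.polyfit(x_train, y_train, deg=1)


def predict(x):
    return slope * x + intercept


# Non-conformity score V(X, Y) = |Y - f(X)|, computed on the calibration split.
alpha = 0.1
v_cal = np.abs(y_cal - predict(x_cal))
q = split_conformal_quantile(v_cal, alpha)

# The prediction interval is [f(x) - q, f(x) + q]; check coverage on test data.
v_test = np.abs(y_test - predict(x_test))
covered = v_test <= q
low, high = x_test < 0.2, x_test > 0.8

marginal_cov = covered.mean()
cov_low = covered[low].mean()
cov_high = covered[high].mean()
ks_high = ks_distance(v_test[high], v_test)  # conditional vs marginal scores

print(f"marginal coverage:      {marginal_cov:.3f}")
print(f"coverage for x < 0.2:   {cov_low:.3f}")
print(f"coverage for x > 0.8:   {cov_high:.3f}")
print(f"KS distance, x > 0.8:   {ks_high:.3f}")
```

In this setup the marginal coverage sits near the nominal level, while the low-noise slab is over-covered and the high-noise slab is under-covered. The KS distance between the slab's score distribution and the marginal score distribution is the kind of gap that the objective above penalizes during training.
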