Batch Multivalid Conformal Prediction

Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth
2022-09-30
Abstract: We develop fast distribution-free conformal prediction algorithms for obtaining multivalid coverage on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two ways: (1) They hold even conditional on group membership: the target coverage level $1-\alpha$ holds conditionally on membership in each group of an arbitrary finite collection $\mathcal{G}$ of (potentially intersecting) regions in the feature space. (2) They hold even conditional on the value of the threshold used to produce the prediction set on a given example. In fact, multivalid coverage guarantees hold even when conditioning on group membership and threshold value simultaneously. We give two algorithms: both take as input an arbitrary non-conformity score and an arbitrary collection of possibly intersecting groups $\mathcal{G}$, and can then equip arbitrary black-box predictors with prediction sets. Our first algorithm (BatchGCP) is a direct extension of quantile regression, needs to solve only a single convex minimization problem, and produces an estimator that has group-conditional guarantees for each group in $\mathcal{G}$. Our second algorithm (BatchMVP) is iterative and gives the full guarantees of multivalid conformal prediction: prediction sets that are valid conditionally on both group membership and non-conformity threshold. We evaluate the performance of both algorithms in an extensive set of experiments. Code to replicate all of our experiments can be found at https://github.com/ProgBelarus/BatchMultivalidConformal
Machine Learning, Statistics Theory
What problem does this paper attempt to address?
This paper addresses how to obtain multivalid coverage guarantees on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two respects:

1. **Conditional on group membership**: The target coverage rate \(1 - \alpha\) holds conditionally on membership in each group of a finite collection \(\mathcal{G}\) of regions in the feature space, even when those groups intersect.
2. **Conditional on the threshold**: The target coverage rate \(1 - \alpha\) holds conditionally on the value of the threshold used to produce the prediction set on a given example.

In fact, multivalid coverage guarantees hold even when conditioning on group membership and threshold value simultaneously. To achieve this, the paper proposes two algorithms:

1. **BatchGCP**: A direct extension of quantile regression. It needs to solve only a single convex minimization problem and produces prediction sets with group-conditional guarantees for each group in \(\mathcal{G}\).
2. **BatchMVP**: An iterative algorithm that gives the full multivalid guarantee, that is, prediction sets that are valid conditionally on both group membership and non-conformity threshold.

Both algorithms take as input an arbitrary non-conformity score and an arbitrary collection of possibly intersecting groups, and can equip any black-box predictor with prediction sets. The paper evaluates both algorithms in an extensive set of experiments and provides code to reproduce the results.
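
To make the "quantile regression with groups" idea behind BatchGCP concrete, the following is a minimal sketch, not the authors' implementation. It assumes the threshold for an example is the sum of one learned weight per group the example belongs to, and it fits those weights by minimizing the average pinball (quantile) loss at level \(1 - \alpha\) with plain subgradient descent. The function names, the zero base threshold, the step-size schedule, and the synthetic data are all illustrative assumptions; any convex solver could be substituted for the descent loop.

```python
import random

def pinball_loss(threshold, score, q):
    """Pinball (quantile) loss of a threshold against one score, target quantile q."""
    diff = score - threshold
    return q * diff if diff >= 0 else (q - 1.0) * diff

def thresholds(memberships, lam):
    """Threshold per example: sum of the weights of the groups it belongs to."""
    return [sum(l for l, b in zip(lam, m) if b) for m in memberships]

def mean_loss(scores, memberships, lam, q):
    """Average pinball loss over the calibration examples."""
    thr = thresholds(memberships, lam)
    return sum(pinball_loss(t, s, q) for t, s in zip(thr, scores)) / len(scores)

def fit_group_thresholds(scores, memberships, q, steps=300, lr=0.05):
    """Subgradient descent on the average pinball loss (hypothetical solver choice).

    Returns the best weight vector seen, so the result is never worse
    than the all-zero starting point.
    """
    n, k = len(scores), len(memberships[0])
    lam = [0.0] * k
    best_lam, best_loss = list(lam), mean_loss(scores, memberships, lam, q)
    for t in range(steps):
        grad = [0.0] * k
        for s, m, thr in zip(scores, memberships, thresholds(memberships, lam)):
            # Subgradient of the pinball loss with respect to the threshold.
            g = -q if s > thr else 1.0 - q
            for j, b in enumerate(m):
                if b:
                    grad[j] += g / n
        step = lr / (t + 1) ** 0.5  # decaying step size
        lam = [l - step * gj for l, gj in zip(lam, grad)]
        loss = mean_loss(scores, memberships, lam, q)
        if loss < best_loss:
            best_lam, best_loss = list(lam), loss
    return best_lam

def group_coverage(scores, memberships, lam):
    """Empirical coverage per group: fraction of members with score <= threshold."""
    thr = thresholds(memberships, lam)
    cov = []
    for j in range(len(memberships[0])):
        hits = [s <= t for s, t, m in zip(scores, thr, memberships) if m[j]]
        cov.append(sum(hits) / len(hits) if hits else float("nan"))
    return cov

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic, intersecting groups: everyone, odd indices, multiples of 3.
    memberships = [(1, i % 2, int(i % 3 == 0)) for i in range(200)]
    # Synthetic non-conformity scores; members of the second group are noisier.
    scores = [1.0 + rng.random() * (1 + 2 * m[1]) for m in memberships]
    lam = fit_group_thresholds(scores, memberships, q=0.9)
    print("per-group empirical coverage:", group_coverage(scores, memberships, lam))
```

The prediction set for a new example would then contain every label whose non-conformity score is at most that example's threshold. Because a short subgradient run only approximates the minimizer, the per-group coverages printed here should be read as approximate; the sketch does not address the threshold-conditional guarantee that BatchMVP targets.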