Abstract: Existing conformal prediction algorithms estimate prediction intervals at target confidence levels to characterize the performance of a regression model on new test samples. However, considering an autonomous system consisting of multiple modules, prediction intervals constructed for individual modules fall short of accommodating uncertainty propagation over different modules and thus cannot provide reliable predictions on system behavior. We address this limitation and present novel solutions based on conformal prediction to provide prediction intervals calibrated for a predictive system consisting of cascaded modules (e.g., an upstream feature extraction module and a downstream regression module). Our key idea is to leverage module-level validation data to characterize the system-level error distribution without direct access to end-to-end validation data. We provide theoretical justification and empirical experimental results to demonstrate the effectiveness of proposed solutions. In comparison to prediction intervals calibrated for individual modules, our solutions generate improved intervals with more accurate performance guarantees for system predictions, which are demonstrated on both synthetic systems and real-world systems performing overlap prediction for indoor navigation using the Matterport3D dataset.
What problem does this paper attempt to address?
This paper asks how to provide reliable performance guarantees for a system composed of multiple prediction modules. Existing conformal prediction algorithms estimate prediction intervals for a single module, characterizing the performance of a regression model on new test samples. In an autonomous system composed of multiple modules, however, intervals constructed separately for each module do not account for how uncertainty propagates between modules, and so cannot provide reliable predictions of system behavior. The paper proposes solutions based on conformal prediction that provide calibrated prediction intervals for systems with cascaded modules (such as an upstream feature extraction module and a downstream regression module). The key idea is to use module-level validation data to characterize the system-level error distribution without direct access to end-to-end validation data.
### Main Contributions
1. **Identify and address the problem of reliable system-level performance guarantees**: For systems with cascaded modules, the paper proposes new calibration solutions that let system users make safe and informative predictions of system behavior without system-level validation data.
2. **Utilize module-level validation data**: A similarity-based calibration method uses a small module-level validation data set to estimate the upper bound and empirical quantiles of the target system-level error.
3. **Experimental verification**: Experiments on synthetic and real-world data show that module-level algorithms are insufficient to characterize system-level uncertainty, and that the proposed system-level solutions provide improved prediction intervals whose empirical coverage is closer to the target level.
### Method Overview
#### 3.1 End-to-End System-Level Calibration
If at least a small amount of system-level supervised data is available for calibration, this method can generate appropriate performance predictions. In practice, however, this assumption is often unrealistic, because additional system-level data may be expensive or difficult to collect.
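As a point of reference, end-to-end calibration can be sketched with standard split conformal prediction on absolute residuals. This is a minimal illustration under stated assumptions, not the paper's implementation: `system_predict`, `x_cal`, `z_cal`, and the `coverage` parameter (the target coverage level, e.g. 0.9) are hypothetical names, and the system output is assumed to be a scalar per sample.

```python
import math
import numpy as np

def conformal_quantile(scores, coverage):
    """Finite-sample-corrected quantile used in split conformal prediction."""
    scores = np.sort(np.asarray(scores, dtype=float))
    n = scores.size
    k = math.ceil((n + 1) * coverage)  # rank of the calibration score to use
    if k > n:
        return math.inf  # too few calibration points for this coverage level
    return float(scores[k - 1])

def end_to_end_interval(system_predict, x_cal, z_cal, x_new, coverage=0.9):
    """Symmetric interval around the system prediction for new inputs.

    system_predict: hypothetical callable for the full cascaded system.
    x_cal, z_cal:   system-level calibration inputs and ground-truth outputs.
    """
    residuals = np.abs(system_predict(x_cal) - z_cal)  # system-level errors
    q = conformal_quantile(residuals, coverage)
    pred = system_predict(x_new)
    return pred - q, pred + q
```

The interval width is the same for every test sample and depends entirely on the system-level calibration residuals, which is why this route needs end-to-end labeled data.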
#### 3.2 System-Level Calibration Using Module-Level Data
Using only module-level data from the training distribution, an algorithm is proposed to estimate the test distribution of the system-level prediction error \( S \). The specific steps are as follows:
- Based on the upstream validation data \( D_f \), calculate the upstream prediction error \( U_i \):
\[
U_i=|(\hat{g} \circ \hat{f})(X_i)-(\hat{g} \circ f)(X_i)| = |\hat{g}(\hat{f}(X_i))-\hat{g}(Y_i)|
\]
- Based on the downstream validation data \( D_g \), calculate the downstream prediction error \( W_j \):
\[
W_j = |(\hat{g} \circ f)(X_j)-(g \circ f)(X_j)|=|\hat{g}(Y_j)-Z_j|
\]
- Using the triangle inequality, obtain the upper bound of the system-level error \( S_i \):
\[
S_i = |(\hat{g} \circ \hat{f})(X_i)-(g \circ f)(X_i)|\leq U_i + W_i
\]
- Calculate the empirical quantiles of the module-level errors \( U \) and \( W \), and estimate the upper bound of the target system-level error \( S \):
\[
\hat{Q}_{\alpha, D_f, D_g}=\min_{\beta\in[\alpha, 1]}[Q_{\beta, D_f}(U)+Q_{1 - \beta+\alpha, D_g}(W)]
\]
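The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code. It assumes that \( \alpha \) denotes the target coverage level (as the range \( \beta \in [\alpha, 1] \) suggests), that \( Q \) is a plain empirical quantile without finite-sample correction, that \( \hat{g} \) returns one scalar per sample, and that the minimum over \( \beta \) is approximated on a grid. The names `f_hat`, `g_hat`, and `num_grid` are hypothetical. Under this reading, the sum is a valid bound by a union bound: if \( U \) exceeds its \( \beta \)-quantile with probability at most \( 1 - \beta \) and \( W \) exceeds its \( (1 - \beta + \alpha) \)-quantile with probability at most \( \beta - \alpha \), then \( U + W \) exceeds the sum with probability at most \( 1 - \alpha \).

```python
import numpy as np

def upstream_errors(g_hat, f_hat, x, y):
    """U_i = |g_hat(f_hat(X_i)) - g_hat(Y_i)| on upstream validation data D_f."""
    return np.abs(g_hat(f_hat(x)) - g_hat(y))

def downstream_errors(g_hat, y, z):
    """W_j = |g_hat(Y_j) - Z_j| on downstream validation data D_g."""
    return np.abs(g_hat(y) - z)

def empirical_quantile(values, level):
    """Smallest sample value v with at least a `level` fraction of values <= v."""
    values = np.sort(np.asarray(values, dtype=float))
    n = values.size
    k = int(np.ceil(level * n - 1e-12))  # small guard against float round-off
    k = min(max(k, 1), n)
    return float(values[k - 1])

def system_error_bound(u_errors, w_errors, alpha, num_grid=101):
    """Grid approximation of min over beta in [alpha, 1] of
    Q_beta(U) + Q_{1 - beta + alpha}(W)."""
    best = np.inf
    for beta in np.linspace(alpha, 1.0, num_grid):
        w_level = min(1.0, 1.0 - beta + alpha)
        candidate = (empirical_quantile(u_errors, beta)
                     + empirical_quantile(w_errors, w_level))
        best = min(best, candidate)
    return best
```

Searching over \( \beta \) matters when one module is much noisier than the other: the search can spend most of the miscoverage budget on the noisy module instead of splitting it evenly.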
#### 3.3 System-Level Calibration Based on Clustering
To further improve the system-level prediction interval, a method is proposed that uses the relative position of samples in the module-level data distribution, relaxing the constraint that all samples share a fixed interval width. The specific steps are as follows:
- Perform K-means clustering on the module-level validation data \( D_f \) and \( D_g \).
- Based on the Euclidean distance between cluster centers, form a cluster correspondence between the two module-level validation data sets.
- For a new test sample \( X_