Density-Calibrated Conformal Quantile Regression

Yuan Lu
2024-11-29
Abstract: This paper introduces the Density-Calibrated Conformal Quantile Regression (CQR-d) method, a novel approach for constructing prediction intervals that adapts to varying uncertainty across the feature space. Building upon conformal quantile regression, CQR-d incorporates local information through a weighted combination of local and global conformity scores, where the weights are determined by local data density. We prove that CQR-d provides valid marginal coverage at level $1 - \alpha - \epsilon$, where $\epsilon$ represents a small tolerance from numerical optimization. Through extensive simulation studies and an application to a heteroscedastic dataset available in R, we demonstrate that CQR-d maintains the desired coverage while producing substantially narrower prediction intervals compared to standard conformal quantile regression (CQR). Notably, in our application on heteroscedastic data, CQR-d achieves an $8.6\%$ reduction in average interval width while maintaining comparable coverage. The method's effectiveness is particularly pronounced in settings with clear local uncertainty patterns, making it a valuable tool for prediction tasks in heterogeneous data environments.
Methodology,Artificial Intelligence,Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is how to construct reliable prediction intervals in the presence of heteroscedasticity and complex non-linear relationships. Traditional methods struggle to maintain reliable coverage across different data distributions, especially when the uncertainty in the data varies widely across the feature space. To address this, the author proposes the Density-Calibrated Conformal Quantile Regression (CQR-d) method.

### Specific Problems and Solutions

1. **Heteroscedasticity and Complex Non-linear Relationships**
   - **Problem**: Traditional methods perform poorly under heteroscedasticity (i.e., the error variance changes with the input) and complex non-linear relationships, which makes the coverage of their prediction intervals unreliable.
   - **Solution**: CQR-d adapts to the uncertainty in different regions by combining local and global conformity scores, with weights set according to local data density.
2. **Trade-off between the Width of Prediction Intervals and Coverage**
   - **Problem**: Traditional prediction interval methods usually secure coverage by producing overly wide intervals, which makes the predictions less informative.
   - **Solution**: By introducing a local adaptation mechanism, CQR-d can reduce the width of prediction intervals while maintaining the required coverage. For example, in the application to a heteroscedastic dataset, CQR-d reduced the average interval width by 8.6% while maintaining comparable coverage.
3. **Local Uncertainty Patterns in High-Dimensional Data**
   - **Problem**: In high-dimensional data, uncertainty can differ greatly between subgroups, and traditional methods have difficulty capturing these local patterns.
   - **Solution**: CQR-d adapts to the local data structure and is especially suited to data with clear local uncertainty patterns.
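The blending of local and global conformity scores described above can be illustrated with a minimal sketch. The functions `cqr_scores` and `conformal_quantile` follow the standard split-conformal CQR recipe. Everything else is an assumption made for illustration: this summary does not give the paper's exact weighting formula, so the function `density_blended_correction`, the k-nearest-neighbour radius as a density proxy, the weight `w = d_ref / (d_ref + d_k)`, and the parameters `k` and `d_ref` are hypothetical and do not carry the paper's coverage guarantee.

```python
import numpy as np

def cqr_scores(y, q_lo, q_hi):
    """Standard CQR conformity score: how far y falls outside [q_lo, q_hi]
    (negative when y lies strictly inside the interval)."""
    return np.maximum(q_lo - y, y - q_hi)

def conformal_quantile(scores, alpha):
    """Order statistic ceil((n + 1)(1 - alpha)) of the scores, i.e. the
    finite-sample-corrected (1 - alpha) quantile used in split conformal."""
    n = len(scores)
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:                      # too few scores for this alpha
        return np.inf
    return np.sort(scores)[rank - 1]

def density_blended_correction(x, X_cal, scores, alpha, k, d_ref):
    """HYPOTHETICAL blend of a local and a global score quantile.
    The weight rule below is an illustrative assumption, not the paper's."""
    d = np.linalg.norm(X_cal - x, axis=1)    # distance to each calibration point
    nearest = np.argsort(d)[:k]              # indices of the k nearest neighbours
    d_k = d[nearest[-1]]                     # k-NN radius: small where data is dense
    w = d_ref / (d_ref + d_k)                # in (0, 1]; closer to 1 in dense regions
    q_local = conformal_quantile(scores[nearest], alpha)
    q_global = conformal_quantile(scores, alpha)
    return w * q_local + (1 - w) * q_global

# Toy heteroscedastic calibration set (synthetic, for illustration only).
rng = np.random.default_rng(0)
X_cal = rng.uniform(0, 1, size=(500, 1))
sigma = 0.1 + X_cal[:, 0]                    # noise level grows with x
y_cal = rng.normal(0, sigma)
q_lo, q_hi = -sigma, sigma                   # placeholder quantile estimates
scores = cqr_scores(y_cal, q_lo, q_hi)

for x0 in (0.1, 0.9):
    c = density_blended_correction(np.array([x0]), X_cal, scores,
                                   alpha=0.1, k=50, d_ref=0.05)
    # The adjusted interval at x0 would be [q_lo(x0) - c, q_hi(x0) + c].
    print(f"x = {x0}: correction = {c:.3f}")
```

In this sketch a dense neighbourhood pushes the weight toward the local quantile, while a sparse one falls back on the global quantile, which is the qualitative behaviour the summary attributes to CQR-d.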
### Theoretical and Empirical Results

- **Theoretical Basis**: The paper proves that CQR-d provides valid marginal coverage at level $1 - \alpha - \epsilon$, where $\epsilon$ is a small tolerance arising from numerical optimization.
- **Empirical Research**: In extensive simulation experiments and an application to real data (the diamond price data set), CQR-d performs better than standard CQR across a range of sample sizes, with the clearest gains in interval width.

In conclusion, CQR-d aims to provide a more efficient and accurate tool for constructing prediction intervals, especially in complex and heterogeneous data environments.
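The two quantities compared in these results, empirical coverage and average interval width, can be computed as in the sketch below. The function names and the toy arrays are assumptions chosen for illustration and are not the paper's data or code.

```python
import numpy as np

def empirical_coverage(y, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

def average_width(lower, upper):
    """Mean length of the prediction intervals."""
    return float(np.mean(upper - lower))

def relative_width_reduction(width_baseline, width_new):
    """Relative shrinkage of width_new with respect to width_baseline."""
    return (width_baseline - width_new) / width_baseline

# Toy intervals from two hypothetical methods on the same four test points.
y = np.array([1.0, 2.0, 3.0, 4.0])
lo_a, hi_a = y - 1.0, y + 1.0        # baseline: width 2.0 everywhere
lo_b, hi_b = y - 0.75, y + 0.75      # alternative: width 1.5 everywhere

print(empirical_coverage(y, lo_a, hi_a), empirical_coverage(y, lo_b, hi_b))
print(relative_width_reduction(average_width(lo_a, hi_a),
                               average_width(lo_b, hi_b)))
```

On this scale, the 8.6% reduction reported for CQR-d corresponds to `relative_width_reduction` returning 0.086 when the baseline is standard CQR.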