Conformal Prediction for Multimodal Regression

Alexis Bose, Jonathan Ethier, Paul Guinand
2024-10-28
Abstract: This paper introduces multimodal conformal regression. Traditionally confined to scenarios with solely numerical input features, conformal prediction is now extended to multimodal contexts through our methodology, which harnesses internal features from complex neural network architectures processing images and unstructured text. Our findings highlight the potential for internal neural network features, extracted from convergence points where multimodal information is combined, to be used by conformal prediction to construct prediction intervals (PIs). This capability paves new paths for deploying conformal prediction in domains abundant with multimodal data, enabling a broader range of problems to benefit from guaranteed distribution-free uncertainty quantification.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how to apply conformal prediction (CP) to multimodal regression tasks, in particular those with heterogeneous inputs such as tabular data, unstructured text, and images. Traditional conformal prediction methods mainly apply when the inputs are purely numerical features, and they face significant challenges with multimodal data. The paper proposes running conformal prediction on internal neural network features to generate prediction intervals (PIs) that quantify the uncertainty of model predictions.

### Main Contributions

1. **Expands the scope of conformal prediction**: Conformal prediction is extended from purely numerical input features to multimodal data (images, unstructured text, and tabular data), so that a wider range of problems can benefit from distribution-free uncertainty quantification.
2. **Uses internal features**: Features are extracted from the points inside the neural network where the modalities are combined. Because the network has already selected and weighted these features, they are suitable for distance-based conformal prediction.
3. **Validates the new method**: The effectiveness and feasibility of the method in multimodal regression tasks are verified through two case studies (RSRP prediction and price prediction).

### Experimental Setup

- **DTU RSRP Regression Model**: Predicts Reference Signal Received Power (RSRP) from satellite images and tabular features, and evaluates prediction-interval performance under different feature combinations.
- **Multimodal Toolkit: Price Regression Model**: Uses an Airbnb dataset to evaluate how different internal features (BERT output, MLP output, etc.) perform in price prediction.
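To make the idea of distance-based conformal prediction on internal features concrete, here is a minimal sketch of split conformal regression with a k-nearest-neighbour difficulty estimate. It is an illustration under stated assumptions, not the paper's confirmed procedure: the function names `knn_difficulty` and `conformal_intervals` are hypothetical, the kNN-distance normalization is one common choice among several, and the random arrays stand in for features that would in practice be read from the network's fusion layer.

```python
import numpy as np

def knn_difficulty(feats, ref_feats, k=5):
    """Mean Euclidean distance from each row of `feats` to its k nearest rows in `ref_feats`."""
    dists = np.linalg.norm(feats[:, None, :] - ref_feats[None, :, :], axis=2)
    return np.sort(dists, axis=1)[:, :k].mean(axis=1)

def conformal_intervals(pred_cal, y_cal, sigma_cal, pred_test, sigma_test,
                        alpha=0.1, beta=1e-6):
    """Split conformal intervals with residuals normalized by a difficulty estimate."""
    # Nonconformity score: absolute residual scaled by the difficulty estimate.
    scores = np.abs(y_cal - pred_cal) / (sigma_cal + beta)
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = int(np.ceil((n + 1) * (1 - alpha)))
    if rank > n:
        raise ValueError("calibration set too small for the requested alpha")
    q = np.sort(scores)[rank - 1]
    half_width = q * (sigma_test + beta)
    return pred_test - half_width, pred_test + half_width

# Synthetic stand-ins (assumption): `z` plays the role of fused internal features,
# `pred` the role of the network's point prediction.
rng = np.random.default_rng(0)

def make_split(n):
    z = rng.normal(size=(n, 8))
    pred = z.sum(axis=1)
    y = pred + rng.normal(scale=0.5, size=n)
    return z, y, pred

z_train, _, _ = make_split(500)            # reference set for distances
z_cal, y_cal, pred_cal = make_split(300)   # calibration set
z_test, y_test, pred_test = make_split(200)

sigma_cal = knn_difficulty(z_cal, z_train)
sigma_test = knn_difficulty(z_test, z_train)
lower, upper = conformal_intervals(pred_cal, y_cal, sigma_cal,
                                   pred_test, sigma_test, alpha=0.1)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

Distances for both the calibration and the test points are measured against a separate reference set, so the two sets are treated identically and the exchangeability assumption behind the coverage guarantee is preserved. Swapping internal for external features only changes what is passed in as `z`.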
### Results and Conclusions

- **RSRP Prediction**: Prediction intervals built from internal and external features perform similarly, with external features slightly better. This indicates that internal features can be used for conformal prediction but may be less effective than external features in some cases.
- **Price Prediction**: Internal features (especially BERT output) perform close to external features but do not surpass them. This may be due to the small test set or the high uncertainty of the model itself.

In conclusion, this research demonstrates the potential of using internal neural network features for conformal prediction in multimodal regression tasks and suggests directions for future research.
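Comparisons such as the ones above are usually based on how often the intervals contain the true value and how wide they are. The summary does not name the metrics used in the paper, so the following is only a sketch assuming two common ones, empirical coverage and mean interval width. The function name `evaluate_intervals` and the toy numbers are hypothetical.

```python
import numpy as np

def evaluate_intervals(y_true, lower, upper):
    """Return (empirical coverage, mean interval width) for a set of prediction intervals."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    covered = (lower <= y_true) & (y_true <= upper)
    return covered.mean(), (upper - lower).mean()

# Toy values (assumption): four targets and their intervals.
y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 2.5, 2.0, 3.0]
upper = [1.5, 3.5, 4.0, 5.0]

coverage, mean_width = evaluate_intervals(y_true, lower, upper)
print(f"coverage={coverage:.2f}, mean width={mean_width:.2f}")
# prints: coverage=0.75, mean width=1.50
```

At a fixed target coverage, a feature set that yields narrower intervals is the more informative one, which is one way to read "slightly better" in the comparison of internal and external features.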