Distribution Free Uncertainty for the Minimum Norm Solution of Over-parameterized Linear Regression

Koby Bibas,Meir Feder
DOI: https://doi.org/10.48550/arXiv.2102.07181
2021-06-17
Abstract:A fundamental principle of learning theory is that there is a trade-off between the complexity of a prediction rule and its ability to generalize. Modern machine learning models do not obey this paradigm: They produce an accurate prediction even with a perfect fit to the training set. We investigate over-parameterized linear regression models focusing on the minimum norm solution: This is the solution with the minimal norm that attains a perfect fit to the training set. We utilize the recently proposed predictive normalized maximum likelihood (pNML) learner which is the min-max regret solution for the distribution-free setting. We derive an upper bound of this min-max regret which is associated with the prediction uncertainty. We show that if the test sample lies mostly in a subspace spanned by the eigenvectors associated with the large eigenvalues of the empirical correlation matrix of the training data, the model generalizes despite its over-parameterized nature. We demonstrate the use of the pNML regret as a point-wise learnability measure on synthetic data and successfully observe the double-decent phenomenon of the over-parameterized models on UCI datasets.
Machine Learning
What problem does this paper attempt to address?