Model averaging prediction by K-fold cross-validation

Xinyu Zhang, Chu-An Liu
DOI: https://doi.org/10.1016/j.jeconom.2022.04.007
IF: 3.363
2023-07-01
Journal of Econometrics
Abstract: This paper considers the model averaging prediction in a quasi-likelihood framework that allows for parameter uncertainty and model misspecification. We propose an averaging prediction that selects the data-driven weights by minimizing a K-fold cross-validation criterion. We provide two theoretical justifications for the proposed method. First, when all candidate models are misspecified, we show that the proposed averaging prediction using K-fold cross-validation weights is asymptotically optimal in the sense of achieving the lowest possible prediction risk. Second, when the model set includes correctly specified models, we demonstrate that the proposed K-fold cross-validation asymptotically assigns all weights to the correctly specified models. Monte Carlo simulations show that the proposed averaging prediction achieves lower empirical risk than other existing model averaging methods. As an empirical illustration, the proposed method is applied to credit card default prediction.
economics; social sciences, mathematical methods; mathematics, interdisciplinary applications
What problem does this paper attempt to address?
The paper asks how to choose model averaging weights for prediction in a quasi-likelihood framework when parameters are uncertain and the candidate models may be misspecified. Its answer is to select data-driven weights by minimizing a K-fold cross-validation criterion. The paper provides two theoretical justifications for this choice: 1. When all candidate models are misspecified, the averaging prediction with K-fold cross-validation weights is asymptotically optimal, in the sense of achieving the lowest possible prediction risk. 2. When the model set contains correctly specified models, K-fold cross-validation asymptotically assigns all weight to those models. Monte Carlo simulations show that the proposed averaging prediction achieves lower empirical risk than other existing model averaging methods. As an empirical application, the method is applied to credit card default prediction, where the results show that its predictive performance is superior to that of other existing methods.
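
The idea of choosing averaging weights by K-fold cross-validation can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes squared-error loss and ordinary least squares candidate models as a simple stand-in for the paper's quasi-likelihood framework, and it uses projected gradient descent as one convenient way to minimize the criterion over the weight simplex. The function names (`project_to_simplex`, `kfold_predictions`, `cv_weights`) and the simulated data are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def kfold_predictions(X, y, column_sets, K=5):
    """Out-of-fold OLS predictions, one column per candidate model.

    Assumption: rows are already in random order, so contiguous folds are fine.
    """
    n = len(y)
    folds = np.array_split(np.arange(n), K)
    P = np.zeros((n, len(column_sets)))
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        for m, cols in enumerate(column_sets):
            # Fit candidate model m without the held-out fold.
            beta, *_ = np.linalg.lstsq(X[train][:, cols], y[train], rcond=None)
            P[test_idx, m] = X[test_idx][:, cols] @ beta
    return P

def cv_weights(P, y, n_iter=5000):
    """Minimize the K-fold criterion mean((y - P w)^2) over the simplex."""
    n, M = P.shape
    G = P.T @ P / n
    b = P.T @ y / n
    # Step 1/L, where L is the largest eigenvalue of G (gradient is G w - b).
    step = 1.0 / (np.linalg.eigvalsh(G)[-1] + 1e-12)
    w = np.full(M, 1.0 / M)
    for _ in range(n_iter):
        w = project_to_simplex(w - step * (G @ w - b))
    return w

# Simulated example (hypothetical data): three nested candidate models.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(size=n)
column_sets = [[0, 1], [0, 1, 2], [0, 1, 2, 3]]

P = kfold_predictions(X, y, column_sets, K=5)
w = cv_weights(P, y)
cv_risk = np.mean((y - P @ w) ** 2)
print("weights:", w, "CV risk:", cv_risk)
```

In this sketch the weights are constrained to be non-negative and to sum to one, and each candidate model is refit K times so that every observation is predicted by a model that did not see it. For a binary outcome such as credit card default, the squared-error loss would be replaced by a suitable quasi-likelihood loss.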