Abstract: When multiple models are considered in regression problems, the model averaging method can be used to weigh and integrate the models. In the present study, we examined how the goodness-of-prediction of the estimator depends on the dimensionality of explanatory variables when using a generalization of the model averaging method in a linear model. We specifically considered the case of high-dimensional explanatory variables, with multiple linear models deployed for subsets of these variables. Consequently, we derived the optimal weights that yield the best predictions. We also observed that the double-descent phenomenon occurs in the model averaging estimator. Furthermore, we obtained theoretical results by adapting methods such as the random forest to linear regression models. Finally, we conducted a practical verification through numerical experiments.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper explores how the predictive performance of model averaging estimators depends on the dimension of the explanatory variables in a high-dimensional setting. Specifically, the researchers considered model averaging over multiple linear models, each fit on a subset of the variables, and derived the optimal weights that give the best predictions. They also observed the "double descent" phenomenon in the model averaging estimator and verified it through numerical experiments.

### Main contributions

1. **Accurate analysis of model averaging estimators**:
   - Using Random Matrix Theory (RMT), the researchers calculated and described the predictive performance of model averaging estimators in linear models under the assumption of sample isotropy.
   - The results show that the "double descent" phenomenon also occurs when using this estimator.
   - The asymptotic behavior of each model when samples and features are selected at random in a high-dimensional setting was derived.
2. **Optimal weights**:
   - The model averaging estimator consists of a weight vector and multiple minimum-norm least-squares estimators. The weight vector can be optimized to achieve the best prediction of the true value.
   - The researchers derived the exact theoretical curve of the prediction risk of this estimator and obtained the optimal weight vector under the conditions assumed in the study.

### High-dimensional asymptotic framework

The researchers assumed the following high-dimensional asymptotic conditions:

1. **Data generation**: The elements of the data matrix \( X_n\in\mathbb{R}^{n\times p} \) are independent and identically distributed, satisfying \( E[X_{n,ij}] = 0 \), \( \text{Var}[X_{n,ij}] = 1 \), and \( E[|X_{n,ij}|^{12 + \omega}]<\infty \) (where \( \omega>0 \)).
2. **Sample size and dimension**: The sample size \( n\rightarrow\infty \), the dimension \( p\rightarrow\infty \), and the ratio \( p / n\rightarrow\gamma>0 \).
3. **Dimension of candidate models**: The dimensions of the candidate models satisfy \( |S_n^i|\rightarrow\infty \) and \( |S_n^i\cap S_n^j|\rightarrow\infty \) as \( n\rightarrow\infty \); moreover, \( |S_n^i|/n\rightarrow\gamma_i>0 \) and \( |S_n^i\cap S_n^j|/n\rightarrow\gamma_{ij}>0 \) for any \( i, j \).
4. **Samples used by candidate models**: The number of samples used by each candidate model satisfies \( |T_n^i|\rightarrow\infty \) as \( n\rightarrow\infty \); moreover, \( |T_n^i|/n\rightarrow\eta_i>0 \) and \( |T_n^i\cap T_n^j|/n\rightarrow\eta_{ij}>0 \) for any \( i, j \).
5. **Weight vector**: The weight vector \( w_n\in\mathbb{R}^m \) satisfies \( \sum_{i = 1}^m w_{n,i}=1 \) and converges to a weight vector \( w\in\mathbb{R}^m \) as \( n\rightarrow\infty \), \( p\rightarrow\infty \), \( p / n\rightarrow\gamma \).

### Related work

- **Model averaging estimators in linear regression**: Many studies have explored estimating target variables as a weighted sum over multiple linear models, with the weights usually determined by various model selection criteria.
- **Random forests and distributed learning**: The researchers also considered splitting the data by sample and feature indices, an approach similar to random forests and distributed learning.
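
### Illustrative sketch of the estimator

The estimator summarized above (a weight vector combined with minimum-norm least-squares estimators, each fit on a feature subset \( S^i \) and a sample subset \( T^i \)) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code: the function names `min_norm_ls` and `model_averaging_fit`, the Gaussian data, the subset sizes, and the equal weights are all assumptions made for the example. In particular, the equal weights are a placeholder and are not the optimal weights derived in the paper.

```python
import numpy as np


def min_norm_ls(X, y):
    """Minimum-norm least-squares solution via the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(X) @ y


def model_averaging_fit(X, y, feature_subsets, sample_subsets, weights):
    """Weighted sum of minimum-norm LS estimators fit on (T_i, S_i) sub-blocks.

    feature_subsets[i] holds the column indices S_i and sample_subsets[i]
    the row indices T_i of the i-th candidate model. Coefficients outside
    S_i are set to zero. The weights must sum to one.
    """
    weights = np.asarray(weights, dtype=float)
    if len(weights) != len(feature_subsets) or len(weights) != len(sample_subsets):
        raise ValueError("need one weight, feature subset and sample subset per model")
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")

    p = X.shape[1]
    beta = np.zeros(p)
    for w_i, S_i, T_i in zip(weights, feature_subsets, sample_subsets):
        beta_i = np.zeros(p)
        # Fit only on the rows T_i and columns S_i of the design matrix.
        beta_i[S_i] = min_norm_ls(X[np.ix_(T_i, S_i)], y[T_i])
        beta += w_i * beta_i
    return beta


# Hypothetical demo: i.i.d. standard normal features with p > n.
rng = np.random.default_rng(0)
n, p, m = 60, 120, 4
beta_true = rng.standard_normal(p) / np.sqrt(p)
X = rng.standard_normal((n, p))
y = X @ beta_true + 0.5 * rng.standard_normal(n)

# Random feature and sample subsets, one pair per candidate model.
S = [rng.choice(p, size=40, replace=False) for _ in range(m)]
T = [rng.choice(n, size=45, replace=False) for _ in range(m)]
w = np.full(m, 1.0 / m)  # equal weights, an assumption for illustration

beta_hat = model_averaging_fit(X, y, S, T, w)

# Out-of-sample squared prediction error on fresh data.
X_test = rng.standard_normal((1000, p))
test_mse = np.mean((X_test @ (beta_hat - beta_true)) ** 2)
print(f"test MSE of the averaged estimator: {test_mse:.3f}")
```

Varying the subset sizes relative to `n` in such a sketch is one way to explore numerically how the prediction error changes with the ratios \( \gamma_i \) and \( \eta_i \) from the asymptotic framework.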

On High-Dimensional Asymptotic Properties of Model Averaging Estimators

A Model-Averaging Approach for High-Dimensional Regression

Model Averaging with High-Dimensional Dependent Data

A Scalable Frequentist Model Averaging Method

Penalized Time-Varying Model Averaging

On Asymptotic Optimality of Least Squares Model Averaging When True Model Is Included

Model Averaging Estimation for Nonparametric Varying-Coefficient Models with Multiplicative Heteroscedasticity

Model averaging for multivariate multiple regression models

Stability and L2-penalty in Model Averaging

Sequential Model Averaging for High Dimensional Linear Regression Models

Ultra-High Dimensional Model Averaging for Multi-Categorical Response

Optimal Model Averaging for Divergent-Dimensional Poisson Regressions

Model averaging in a multiplicative heteroscedastic model

Partial Linear Model Averaging Prediction for Longitudinal Data

Model Averaging-Based Sufficient Dimension Reduction

Unified Optimal Model Averaging with a General Loss Function Based on Cross-Validation

Optimal Model Averaging of Mixed-Data Kernel-Weighted Spline Regressions

Model Averaging for Nonlinear Regression Models

Parsimonious Model Averaging With a Diverging Number of Parameters

Model Averaging Based on Generalized Method of Moments

Model Averaging by Cross-validation for Partially Linear Functional Additive Models