Post‐Estimation Shrinkage in Full and Selected Linear Regression Models in Low‐Dimensional Data Revisited
Edwin Kipruto, Willi Sauerbrei
DOI: https://doi.org/10.1002/bimj.202300368
IF: 1.715
2024-09-29
Biometrical Journal
Abstract: The fit of a regression model to new data is often worse due to overfitting. Analysts use variable selection techniques to develop parsimonious regression models, which may introduce bias into regression estimates. Shrinkage methods have been proposed to mitigate overfitting and reduce bias in estimates. Post‐estimation shrinkage is an alternative to penalized methods. This study evaluates the effectiveness of post‐estimation shrinkage in improving the prediction performance of full and selected models. Through a simulation study, results were compared with ordinary least squares (OLS) and ridge in full models, and with best subset selection (BSS) and lasso in selected models. We focused on prediction errors and the number of selected variables. Additionally, we proposed a modified version of the parameter‐wise shrinkage (PWS) approach, named non‐negative PWS (NPWS), to address weaknesses of PWS. Results showed that no method was superior in all scenarios. In full models, NPWS outperformed global shrinkage, whereas PWS was inferior to OLS. With low correlation and moderate‐to‐high signal‐to‐noise ratio (SNR), NPWS outperformed ridge, but ridge performed best with small sample sizes, high correlation, and low SNR. In selected models, all post‐estimation shrinkage methods performed similarly, with global shrinkage slightly inferior. Lasso outperformed BSS and post‐estimation shrinkage with small sample sizes, low SNR, and high correlation but was inferior when the opposite was true. Our study suggests that, with sufficient information, NPWS is more effective than global shrinkage in improving the prediction accuracy of models. However, with high correlation, small sample sizes, and low SNR, penalized methods generally outperform post‐estimation shrinkage methods.
mathematical & computational biology, statistics & probability
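
The abstract names global post‐estimation shrinkage but does not say how the shrinkage factor is estimated. As a rough illustration only, the sketch below assumes one common approach: the factor is taken as the slope from regressing the outcome on cross‐validated linear predictors, and the OLS slopes are then multiplied by it. The function names (`ols_fit`, `global_shrinkage`), the fold count, and the simulated data are hypothetical and are not taken from the paper. PWS and NPWS are not shown.

```python
import numpy as np


def ols_fit(X, y):
    """Return (intercept, slopes) from an ordinary least squares fit."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]


def global_shrinkage(X, y, n_folds=10, seed=0):
    """Sketch of global post-estimation shrinkage (assumed procedure).

    Returns the shrinkage factor c, the re-estimated intercept, and the
    shrunken slopes c * beta_OLS.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    # Balanced random assignment of observations to folds.
    folds = rng.permutation(n) % n_folds

    # Cross-validated linear predictor: each observation is predicted
    # from slopes estimated without its own fold.
    eta_cv = np.empty(n)
    for k in range(n_folds):
        test = folds == k
        _, b_k = ols_fit(X[~test], y[~test])
        eta_cv[test] = X[test] @ b_k

    # Shrinkage factor: slope of y regressed on the CV linear predictor.
    _, slope = ols_fit(eta_cv.reshape(-1, 1), y)
    c = float(slope[0])

    # Shrink the full-data OLS slopes and re-estimate the intercept so
    # that the mean prediction equals the mean outcome.
    _, b = ols_fit(X, y)
    b_shrunk = c * b
    b0_shrunk = y.mean() - X.mean(axis=0) @ b_shrunk
    return c, b0_shrunk, b_shrunk


# Hypothetical simulated data, for illustration only.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = X @ np.array([1.0, 0.5, 0.0, 0.0, -0.5]) + rng.normal(size=60)
c, b0, b = global_shrinkage(X, y)
print("shrinkage factor:", c)
```

A single factor applied to all slopes is what distinguishes global shrinkage from the parameter‐wise variants, which estimate one factor per coefficient.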