Variable importance in regression models

Ulrike Grömping
DOI: https://doi.org/10.1002/wics.1346
2015-02-06
Wiley Interdisciplinary Reviews: Computational Statistics
Abstract: Regression analysis is one of the most‐used statistical methods. Often part of the research question is the identification of the most important regressors or an importance ranking of the regressors. Most regression models are not specifically suited for answering the variable importance question, so that many different proposals have been made. This article reviews in detail the various variable importance metrics for the linear model, particularly emphasizing variance decomposition metrics. All linear model metrics are illustrated by an example analysis. For nonlinear parametric models, several principles from linear models have been adapted, and machine‐learning methods have their own set of variable importance methods. These are also briefly covered. Although there are many variable importance metrics, there is still no convincing theoretical basis for them, and they all have a heuristic touch. Nevertheless, some metrics are considered useful for a crude assessment in the absence of a good subject matter theory.
Citation: WIREs Comput Stat 2015, 7:137–152. doi: 10.1002/wics.1346
Categorized under: Statistical and Graphical Methods of Data Analysis > Multivariate Analysis