Variable Selection in Joint Modelling of the Mean and Variance for Hierarchical Data

Christiana Charalambous, Jianxin Pan, Mark Tranmer
DOI: https://doi.org/10.1177/1471082x13520424
2014-01-01
Statistical Modelling
Abstract: We propose to extend the use of penalized likelihood variable selection to hierarchical generalized linear models (HGLMs) for jointly modelling the mean and variance structures. We assume a two-level hierarchical data structure, with subjects nested within groups. A generalized linear mixed model (GLMM) is fitted for the mean, with a structured dispersion in the form of a generalized linear model (GLM) for the between-group variation. For variable selection, we use the smoothly clipped absolute deviation (SCAD) penalty, which simultaneously shrinks the coefficients of redundant variables to 0 and estimates the coefficients of the remaining important covariates. We run simulation studies and a real data analysis for the joint mean–variance models to assess the performance of the proposed procedure against a similar process that excludes variable selection. The results indicate that our method can successfully identify the zero/non-zero components in our models and can also significantly improve the efficiency of the resulting penalized estimates.
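The abstract names the SCAD penalty but does not write it out. As a rough illustration, the sketch below evaluates the standard SCAD penalty of Fan and Li (2001) and its derivative for a single coefficient. The function names `scad_penalty` and `scad_derivative` are hypothetical, and the default `a = 3.7` is the value conventionally used in the SCAD literature, not a setting confirmed by this abstract. It is not the authors' fitting procedure, only the penalty term that such a procedure would add to the likelihood.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (assumes lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic segment that tapers the penalty.
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return (a + 1) * lam * lam / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|, for |theta| > 0."""
    t = abs(theta)
    if t <= lam:
        return lam
    # Decreases linearly to zero at |theta| = a * lam and stays zero beyond it.
    return max(a * lam - t, 0.0) / (a - 1)


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.0, 2.0, 3.7, 10.0):
        print(theta, scad_penalty(theta, lam=1.0), scad_derivative(theta, lam=1.0))
```

The derivative vanishes once the absolute value of a coefficient exceeds `a * lam`, which is why large coefficients are left nearly unpenalized while small ones are shrunk towards 0.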