Variable Selection and Estimation for Regression Models with Compositional Data Predictors

Huiwen Wang, Seng Lee Huang, Liying Shangguan, Siyan Wang
2015-01-01
Abstract: With the development of computer-related technology, collecting and storing data has become increasingly convenient, which makes the presence of irrelevant variables in a given problem unavoidable. Various variable selection methods have been proposed in the literature for this setting, but few consider the structure among the predictors, particularly for compositional data. The nonzero components of a compositional observation sum to one, a constraint that holds before any modelling and makes it indispensable to account for the correlation between components. For regression models in which multiple compositional predictors and multivariate scalar predictors are mixed to predict a scalar response, we propose to select the significant predictors and estimate their coefficients simultaneously, treating each compositional predictor as a group and each scalar predictor individually. Several group variable shrinkage methods, such as group Lasso and group SCAD, can be implemented in this procedure. In addition, to make the estimator robust, we use the quantile loss function to weaken the effect of outliers and to reflect the distribution of the response. Simulations and a real data example illustrate the effectiveness of the proposed method.
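As a rough illustration of the kind of objective the abstract describes, the following minimal sketch combines a quantile (check) loss with a group Lasso penalty, where each compositional predictor forms one group and each scalar predictor forms a group of size one. It is not the authors' implementation: the function names, the toy data, the use of raw (untransformed) components, and the square-root-of-group-size weighting are all assumptions made here for illustration.

```python
import math

def check_loss(residual, tau):
    """Quantile (check) loss: tau * r if r >= 0, else (tau - 1) * r."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def group_lasso_penalty(beta, groups, lam):
    """Sum over groups of lam * sqrt(group size) * Euclidean norm.

    The sqrt(group size) weight is a common convention, assumed here.
    """
    total = 0.0
    for idx in groups:
        norm = math.sqrt(sum(beta[j] ** 2 for j in idx))
        total += lam * math.sqrt(len(idx)) * norm
    return total

def objective(X, y, beta, groups, tau, lam):
    """Average check loss of the linear fit plus the group penalty."""
    loss = 0.0
    for row, yi in zip(X, y):
        fitted = sum(b * x for b, x in zip(beta, row))
        loss += check_loss(yi - fitted, tau)
    return loss / len(y) + group_lasso_penalty(beta, groups, lam)

# Hypothetical toy data: columns 0-2 are one composition (each row sums
# to one), column 3 is an ordinary scalar predictor.
X = [[0.2, 0.3, 0.5, 1.0],
     [0.6, 0.1, 0.3, -1.0]]
y = [1.0, 0.0]
groups = [[0, 1, 2], [3]]      # one group per composition, singletons otherwise
beta = [0.0, 0.0, 0.0, 1.0]    # the whole compositional group is shrunk to zero

value = objective(X, y, beta, groups, tau=0.5, lam=0.1)
```

Because the penalty acts on the norm of each group, a compositional predictor is kept or dropped as a whole, while scalar predictors are selected one at a time. A group SCAD variant would replace the penalty function and leave the loss term unchanged.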