Variable Selection for High Dimensional Gaussian Copula Regression Model: an Adaptive Hypothesis Testing Procedure.

Yong He, Xinsheng Zhang, Liwen Zhang
DOI: https://doi.org/10.1016/j.csda.2018.03.003
IF: 2.035
2018-01-01
Computational Statistics & Data Analysis
Abstract: In this paper we consider the variable selection problem for the high-dimensional Gaussian copula regression model. We transform the variable selection problem into a multiple testing problem. Unlike existing methods that rely on regularization or a stepwise algorithm, our method avoids both the ambiguous relationship between the regularization parameter and the number of falsely discovered variables and the need to choose a stopping rule. We exploit nonparametric rank-based correlation coefficient estimators to construct our test statistics, which achieve robustness and adaptivity to the unknown monotone marginal transformations. We show that our multiple testing procedure can control the false discovery rate (FDR) or the average number of falsely discovered variables (FDV) asymptotically. We also propose a screening multiple testing procedure to deal with the extremely high-dimensional setting. Besides theoretical analysis, we conduct numerical simulations to compare the variable selection performance of our method with some state-of-the-art methods. The proposed method is also applied to a communities and crime unnormalized data set to illustrate its empirical usefulness.
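Illustration: the abstract's claim that rank-based correlation estimators adapt to unknown monotone marginal transformations can be checked on a toy example. The sketch below is not the authors' test statistic or testing procedure. It assumes only the standard Gaussian copula identity that the latent correlation equals sin(pi/2 * tau), where tau is Kendall's tau, and all names (`kendall_tau`, `latent_correlation`, the chosen transformations, the sample size) are hypothetical choices made for this example.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau for samples without ties, by counting pairs (O(n^2))."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            # A pair is concordant when both coordinates are ordered the same way.
            concordant = (x[i] > x[j]) == (y[i] > y[j])
            score += 1 if concordant else -1
    return 2.0 * score / (n * (n - 1))


def latent_correlation(x, y):
    """Rank-based estimate of the latent Gaussian correlation: sin(pi/2 * tau)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


# Simulate a latent bivariate normal pair with correlation rho (assumed value).
random.seed(0)
rho = 0.6
n = 500
z1 = [random.gauss(0.0, 1.0) for _ in range(n)]
z2 = [rho * a + math.sqrt(1.0 - rho ** 2) * random.gauss(0.0, 1.0) for a in z1]

# Observed data: strictly increasing marginal transformations of the latent pair,
# standing in for the "unknown" transformations of a Gaussian copula model.
x = [math.exp(a) for a in z1]
y = [b ** 3 for b in z2]

est_latent = latent_correlation(z1, z2)   # computed on the latent variables
est_observed = latent_correlation(x, y)   # computed on the transformed data

print(f"estimate from latent data:      {est_latent:.3f}")
print(f"estimate from transformed data: {est_observed:.3f}")
```

Because the estimator depends on the data only through the ordering of the observations, the two estimates coincide, and both should be close to the simulated `rho`. A Pearson correlation computed on `x` and `y` would not share this invariance.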