Variable selection for classification with derivative-induced regularization

Xin He, Shaogao Lv, Junhui Wang
DOI: https://doi.org/10.5705/ss.202018.0086
IF: 1.4
2020-01-01
Statistica Sinica
Abstract: Despite extensive research on variable selection over the past two decades, few studies exist on variable selection for classification, particularly when no assumptions are made about the model. In this paper, we propose a general variable selection framework for classification by examining the conditional probability. The proposed framework is illustrated by means of support vector machine (SVM) with derivative-induced sparsity, which makes no explicit model assumption, and takes full advantage of the mathematical properties of the reproducing kernel Hilbert space (RKHS). In contrast to many existing methods, our proposed method leads to a convex optimization task, and fully exploits gradient information by using the reproducing property of gradients in smooth RKHSs. The proposed method can also be viewed as a generalization of the classical SVM, and achieves superior empirical performance in sparse classification. Importantly, the estimation consistency and subset selection properties of the proposed method are established. Lastly, the effectiveness of the method is demonstrated using simulated and real-life examples.
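The gradient idea mentioned in the abstract can be illustrated with a small sketch. It is not the paper's estimator: it assumes a Gaussian kernel, a decision function of the form f(x) = sum_i alpha_i K(x_i, x), and hand-picked (hypothetical) coefficients alpha instead of a fitted SVM. It only shows that the partial derivatives of such a function have a closed form, and that a variable the function does not depend on has a derivative norm of zero. The function names, the toy data, and the bandwidth sigma are all illustrative assumptions.

```python
import math

def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian kernel K(u, v) = exp(-||u - v||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision_value(alpha, X, x, sigma=1.0):
    """Kernel expansion f(x) = sum_i alpha_i K(x_i, x)."""
    return sum(a * gaussian_kernel(xi, x, sigma) for a, xi in zip(alpha, X))

def partial_derivative(alpha, X, x, l, sigma=1.0):
    """Closed-form derivative of f with respect to coordinate l at x:
    sum_i alpha_i K(x_i, x) (x_i[l] - x[l]) / sigma^2."""
    return sum(
        a * gaussian_kernel(xi, x, sigma) * (xi[l] - x[l]) / sigma ** 2
        for a, xi in zip(alpha, X)
    )

def gradient_norms(alpha, X, sigma=1.0):
    """Empirical L2 norm of each partial derivative over the sample points."""
    n, d = len(X), len(X[0])
    return [
        math.sqrt(sum(partial_derivative(alpha, X, x, l, sigma) ** 2 for x in X) / n)
        for l in range(d)
    ]

# Toy sample: the third coordinate is constant, so f cannot vary along it
# at the sample points and its derivative norm there is exactly zero.
X = [[0.0, 1.0, 0.5], [1.0, -1.0, 0.5], [-1.0, 0.5, 0.5], [2.0, 0.0, 0.5]]
alpha = [0.7, -0.3, 0.5, -0.9]  # hypothetical coefficients, not a fitted SVM
norms = gradient_norms(alpha, X)
print(norms)
```

In a sparsity-inducing method of the kind the abstract describes, norms like these would be penalized so that uninformative variables are driven toward zero; here they are only computed for inspection.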