Grouped Variable Selection Using Area under the ROC with Imbalanced Data

Yang Li, Yichen Qin, Limin Wang, Jiaxu Chen, Shuangge Ma
DOI: https://doi.org/10.1080/03610918.2013.818691
2016-01-01
Communications in Statistics - Simulation and Computation
Abstract: Imbalanced data bias classification and lower the classification accuracy for the minority class. In this article, we propose a methodology to select grouped variables using the area under the ROC curve with an adjustable prediction cut point. The proposed method enhances classification accuracy for the minority class by maximizing the true positive rate. Simulation results show that the proposed method is appropriate for both categorical and continuous covariates. An illustrative analysis of the SHS data in TCM demonstrates the application of the proposed method.
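The two quantities the abstract relies on, the area under the ROC curve and the true positive rate at an adjustable cut point, can be illustrated with a minimal sketch. This is not the authors' grouped variable selection procedure: the scores are synthetic, and the function names `auc` and `rates`, the class sizes, and the cut points 0.5 and 0.25 are assumptions chosen only for illustration.

```python
import random


def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rates(scores, labels, cut):
    """Return (true positive rate, false positive rate) when predicting 1 for score >= cut."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cut)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Synthetic imbalanced sample (assumed for illustration): 50 minority, 950 majority cases.
rng = random.Random(0)
labels = [1] * 50 + [0] * 950
scores = ([rng.gauss(0.35, 0.1) for _ in range(50)]
          + [rng.gauss(0.15, 0.1) for _ in range(950)])

auc_value = auc(scores, labels)
tpr_default, fpr_default = rates(scores, labels, 0.5)    # conventional cut point
tpr_lowered, fpr_lowered = rates(scores, labels, 0.25)   # lowered cut point

print(f"AUC: {auc_value:.3f}")
print(f"cut 0.50: TPR={tpr_default:.3f}, FPR={fpr_default:.3f}")
print(f"cut 0.25: TPR={tpr_lowered:.3f}, FPR={fpr_lowered:.3f}")
```

The AUC does not depend on the cut point, whereas the true positive rate does: lowering the cut point can only keep or raise the true positive rate, at the cost of a false positive rate that also stays the same or rises. That trade-off is why a fixed cut point of 0.5 can perform poorly for the minority class when scores are low overall.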