Training Classifiers under Covariate Shift by Constructing the Maximum Consistent Distribution Subset

Xu Yu, Miao Yu, Li-xun Xu, Jing Yang, Zhi-qiang Xie
DOI: https://doi.org/10.1155/2015/302815
IF: 1.43
2015-01-01
Mathematical Problems in Engineering
Abstract: The assumption that training and testing samples are drawn from the same distribution is violated under the covariate shift setting. Most algorithms for this setting first estimate the distributions and then reweight samples based on the estimates. Because a correct distribution is difficult to estimate, previous methods cannot achieve good classification performance. In this paper, we first present two types of covariate shift problems. Rather than estimating the distributions, we then seek an effective method that, based on a split of the feature space, selects from the auxiliary set or the target training set a maximum subset following the target testing distribution. Finally, we prove that our subset selection method handles both covariate shift scenarios consistently. Experimental results demonstrate that a classifier trained on the selected maximum subset exhibits better generalization ability and running efficiency than traditional methods under the covariate shift setting.
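The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one way a feature-space split could drive subset selection, not the authors' algorithm. It assumes an equal-width grid over the features and picks, per grid cell, as many source samples as the test set's share of that cell allows. The names `cell_of` and `select_subset`, the grid split, and the parameters `lo`, `hi`, and `n_bins` are all hypothetical.

```python
import math
import random
from collections import defaultdict


def cell_of(x, lo, hi, n_bins):
    """Map a feature vector to the index tuple of its equal-width grid cell."""
    idx = []
    for v, a, b in zip(x, lo, hi):
        i = int((v - a) / (b - a) * n_bins)
        idx.append(min(max(i, 0), n_bins - 1))  # clamp boundary values
    return tuple(idx)


def select_subset(source, test, lo, hi, n_bins=4, seed=0):
    """Return sorted indices of a largest subset of `source` whose per-cell
    counts are (up to rounding down) proportional to those of `test`.

    Assumption: the grid split stands in for the paper's feature space split.
    """
    rng = random.Random(seed)

    # Group source sample indices by grid cell.
    src_cells = defaultdict(list)
    for i, x in enumerate(source):
        src_cells[cell_of(x, lo, hi, n_bins)].append(i)

    # Count test samples per grid cell.
    test_counts = defaultdict(int)
    for x in test:
        test_counts[cell_of(x, lo, hi, n_bins)] += 1
    if not test_counts:
        return []

    # Largest scale k such that k * test_count <= source_count in every
    # cell the test set occupies.
    k = min(len(src_cells.get(c, [])) / n for c, n in test_counts.items())

    chosen = []
    for c, n in test_counts.items():
        pool = src_cells.get(c, [])
        quota = min(len(pool), math.floor(k * n + 1e-9))
        chosen.extend(rng.sample(pool, quota))
    return sorted(chosen)


if __name__ == "__main__":
    source = [[0.1], [0.2], [0.3], [0.6], [0.7], [0.9]]
    test = [[0.15], [0.8], [0.85], [0.95]]
    print(select_subset(source, test, lo=[0.0], hi=[1.0], n_bins=2))
```

A classifier would then be trained only on the selected indices. Note that in this sketch a single test-occupied cell with no source samples forces an empty subset, so the choice of split granularity matters.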