Correcting Sample Selection Bias for Image Classification

Di Wu, Dinzhong Lin, Li Ya, Wenjun Zhang
DOI: https://doi.org/10.1109/iske.2008.4731115
2008-01-01
Abstract: One of the basic assumptions in traditional machine learning is that training and test data are drawn from the same distribution. However, in image classification this assumption often does not hold, because labeled images are less plentiful than labeled text. In this paper, we propose to use labeled images from relevant but different categories as training data for estimating a prediction model. We apply the correction of sample selection bias, work recognized by the 2000 Nobel Prize in Economics, to our problem. We assume that the training and test data differ in that they are governed by different distributions. By eliminating sample selection bias in the training data, the supervisory knowledge in the training data can be effectively learned for classifying images in the test set. We present theoretical and empirical analysis to demonstrate the effectiveness of our algorithm. The experimental results on two image corpora show that our algorithm can greatly improve several state-of-the-art classifiers when the training and test images come from similar but different categories.
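The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to correct sample selection bias, not the authors' method. It estimates an importance weight for each training point by fitting a classifier that separates training inputs from test inputs. The function names `fit_logistic` and `importance_weights`, the plain gradient-descent fit, and the synthetic one-dimensional Gaussian data are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Plain gradient-descent logistic regression; returns (weights, bias)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted P(domain = test | x)
        grad = p - y                             # gradient of the log loss w.r.t. the logit
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def importance_weights(X_train, X_test):
    """Estimate p_test(x) / p_train(x) at each training point (hypothetical helper)."""
    X = np.vstack([X_train, X_test])
    # Domain label: 0 for training inputs, 1 for test inputs.
    d = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    w, b = fit_logistic(X, d)
    logits = X_train @ w + b
    # exp(logit) = P(test | x) / P(train | x); rescale by the sample-size ratio.
    return np.exp(logits) * (len(X_train) / len(X_test))

# Synthetic stand-in for image features: the test inputs are shifted
# relative to the training inputs (assumed data, not from the paper).
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(200, 1))
X_test = rng.normal(1.0, 1.0, size=(200, 1))

weights = importance_weights(X_train, X_test)
```

Under these assumptions, training points that resemble the test inputs receive larger weights. The weights could then be supplied as per-sample weights when fitting a classifier on the labeled training data, so that the fitted model reflects the test distribution more closely.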