Exploiting Ensemble Method in Semi-Supervised Learning

Jiao Wang, Siwei Luo
DOI: https://doi.org/10.1109/ICMLC.2006.258568
2006-01-01
Abstract: In many practical machine learning fields, obtaining labeled data is hard and expensive. Semi-supervised learning is very useful in these fields because it combines labeled and unlabeled data to boost the performance of learning algorithms. Many semi-supervised learning algorithms have been proposed, among which the "co-training" algorithms are widely used. We present a new co-training strategy. It uses the random subspace method to form an initial ensemble of classifiers, where each classifier is trained on a different subspace of the original feature space. Unlike the earlier co-training work of Blum and Mitchell, which relies on two redundant and sufficient views, our method uses an ensemble of classifiers. Each classifier's predictions on new unlabeled data are combined and used to enlarge the training sets of the others. The ensemble's classifiers are then refined on the enlarged training sets. Experiments on UCI data sets show that when the number of labeled data is relatively small, our method performs better than the data dimensionality
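The strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the base learner, the confidence measure, or the number of examples added per round, so the nearest-centroid learner, the agreement-fraction confidence, and every name and parameter below (`CentroidClassifier`, `vote`, `cotrain`, `subspace_dim`, `per_round`, and so on) are assumptions made for illustration, as is the synthetic data, which stands in for the UCI sets.

```python
import numpy as np


class CentroidClassifier:
    """Nearest-centroid learner on a feature subset (assumed base learner)."""

    def __init__(self, features):
        self.features = features  # indices of this member's random subspace

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack(
            [X[y == c][:, self.features].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, X):
        diff = X[:, self.features][:, None, :] - self.centroids_[None, :, :]
        return self.classes_[np.linalg.norm(diff, axis=2).argmin(axis=1)]


def vote(members, X, n_classes):
    """Majority-vote labels and the fraction of members that agree with them."""
    preds = np.stack([m.predict(X) for m in members])
    counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0), counts.max(axis=0) / len(members)


def cotrain(X_lab, y_lab, X_unlab, n_classes, n_members=5, subspace_dim=3,
            rounds=3, per_round=5, seed=0):
    """Random-subspace co-training sketch; labels must be 0..n_classes-1."""
    rng = np.random.default_rng(seed)
    n_features = X_lab.shape[1]
    # Initial ensemble: each member sees a different random feature subset.
    members = [
        CentroidClassifier(rng.choice(n_features, subspace_dim, replace=False))
        .fit(X_lab, y_lab)
        for _ in range(n_members)
    ]
    train_X = [X_lab.copy() for _ in members]
    train_y = [y_lab.copy() for _ in members]
    used = [np.zeros(len(X_unlab), dtype=bool) for _ in members]
    for _ in range(rounds):
        for i in range(n_members):
            # The other members label the pool for member i.
            others = members[:i] + members[i + 1:]
            labels, agree = vote(others, X_unlab, n_classes)
            agree = np.where(used[i], -1.0, agree)  # skip examples already added
            pick = np.argsort(-agree)[:per_round]   # highest-agreement examples
            pick = pick[agree[pick] > 0]
            used[i][pick] = True
            train_X[i] = np.vstack([train_X[i], X_unlab[pick]])
            train_y[i] = np.concatenate([train_y[i], labels[pick]])
        # Refine every member on its enlarged training set.
        for i, m in enumerate(members):
            m.fit(train_X[i], train_y[i])
    return members


# Synthetic two-class data with 10 labeled and 190 unlabeled examples.
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(-2.0, 1.0, (100, 8)),
               data_rng.normal(2.0, 1.0, (100, 8))])
y = np.repeat([0, 1], 100)
lab = np.r_[0:5, 100:105]
unlab = np.setdiff1d(np.arange(200), lab)

ensemble = cotrain(X[lab], y[lab], X[unlab], n_classes=2)
pred, _ = vote(ensemble, X[unlab], 2)
accuracy = (pred == y[unlab]).mean()
```

In this sketch each member receives labels only from the other members, so a member never trains on its own predictions. Members are refit once per round, after every member's training set has been enlarged; the abstract does not say whether the original method updates them in this order.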