A Tri-training Based Transfer Learning Algorithm
Xiaobo Liu, Harry Zhang, Zhihua Cai, Guangjun Wang
DOI: https://doi.org/10.1109/ictai.2012.99
2012-01-01
Abstract: The lack of labeled training data is a common issue in many machine learning applications. Semi-supervised learning addresses this issue by self-labeling unlabeled examples. Transfer learning tackles it differently: it borrows labeled examples from a different but related domain (the source domain) and assigns weights to those examples based on their suitability for the new domain (the target domain). However, estimating that suitability is quite challenging. In this paper, we propose a different way of utilizing the labeled examples from the source domain: we use them only for labeling the unlabeled examples in the target domain. For this self-labeling step, we adopt the idea of Tri-training. We call the new algorithm TriTransfer. In TriTransfer, three initial classifiers are generated from the source data and the originally labeled data in the target domain, and an unlabeled example is labeled and added to the labeled data of one classifier if the other two classifiers agree on its label. After an expanded labeled data set is obtained, we re-train that classifier. We repeat this process until no further change occurs. At the end, the final classifier, a weighted combination of the three classifiers, is output. We conduct an extensive empirical study on 34 UCI datasets, which shows that TriTransfer performs better than the state-of-the-art algorithms TransferBoost, Tri-training, and NaiveBayes.
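The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the three initial classifiers are generated, which base learner is used, or how the final weights are chosen. The sketch therefore assumes bootstrap samples of the pooled source and labeled target data (as in the original Tri-training), a toy nearest-centroid learner, weights equal to each classifier's accuracy on the originally labeled target data, and a `max_rounds` safety cap. The names `CentroidClassifier` and `tri_transfer` are hypothetical.

```python
import random
from collections import Counter


class CentroidClassifier:
    """Toy nearest-centroid learner (assumed stand-in for the base classifier)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_one(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(sorted(self.centroids), key=dist)


def tri_transfer(Xs, ys, Xl, yl, Xu, seed=0, max_rounds=20):
    """Xs, ys: source data; Xl, yl: labeled target data; Xu: unlabeled target data."""
    rng = random.Random(seed)
    base_X, base_y = list(Xs) + list(Xl), list(ys) + list(yl)
    n = len(base_X)

    # Assumption: three bootstrap samples of source + labeled target data.
    train_sets = []
    for _ in range(3):
        idx = [rng.randrange(n) for _ in range(n)]
        train_sets.append(([base_X[i] for i in idx], [base_y[i] for i in idx]))
    models = [CentroidClassifier().fit(X, y) for X, y in train_sets]

    # added[i] maps an unlabeled example's index to its pseudo-label for classifier i.
    added = [dict() for _ in range(3)]
    for _ in range(max_rounds):
        changed = False
        preds = [[m.predict_one(x) for x in Xu] for m in models]
        for i in range(3):
            j, k = [t for t in range(3) if t != i]
            # Label an example for classifier i only if the other two agree.
            new = {u: preds[j][u] for u in range(len(Xu))
                   if preds[j][u] == preds[k][u]}
            if new != added[i]:
                added[i], changed = new, True
                X0, y0 = train_sets[i]
                models[i] = CentroidClassifier().fit(
                    X0 + [Xu[u] for u in new], y0 + list(new.values()))
        if not changed:  # stop once no classifier's labeled set changes
            break

    # Assumption: weight = accuracy on the originally labeled target data.
    weights = [sum(m.predict_one(x) == y for x, y in zip(Xl, yl)) / len(Xl)
               for m in models]

    def predict(x):
        votes = Counter()
        for m, w in zip(models, weights):
            votes[m.predict_one(x)] += w
        return max(sorted(votes), key=votes.get)

    return predict
```

In this sketch the pseudo-labeled set for each classifier is recomputed every round rather than accumulated, which mirrors the original Tri-training procedure; whether TriTransfer does the same is not stated in the abstract.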