Semi-supervised learning with extremely sparse labeled data on multiple semi-supervised assumptions

Lisong Chen, Huanhuan Chen, Ke Tang
DOI: https://doi.org/10.1109/SoCPaR.2011.6089114
2011-01-01
Abstract: Semi-supervised learning (SSL) with extremely sparse labeled data focuses on how to generate robust classifiers when the training dataset consists of 1%-5% labeled data and plenty of unlabeled data. Existing algorithms often perform poorly on this kind of problem. This paper proposes an approach based on multiple semi-supervised assumptions, which typically does not suffer from the major drawback of existing algorithms, for which adding very few labeled data might actually lead to serious performance degradation. It combines three semi-supervised assumptions, i.e., the smoothness, manifold, and cluster assumptions, and employs an unlabeled-data-oriented strategy that benefits from the perspective of unsupervised learning. Experimental results on various datasets demonstrate that the proposed algorithm exhibits strong robustness against extremely sparse labeled data and outperforms a number of existing SSL techniques.
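To make the sparse-label setting concrete, the following is a minimal sketch of plain graph-based label propagation on synthetic data in which only 2% of the points keep their labels. It is not the algorithm proposed in the paper, which the abstract does not specify in detail; it only illustrates how unlabeled data can carry label information under the smoothness and cluster assumptions. All names (make_two_blobs, symmetric_knn_graph, propagate_labels, run_demo), the synthetic data, and the parameter values are assumptions made for illustration.

```python
import random


def make_two_blobs(n_per_class, seed=0):
    """Two well-separated 2-D Gaussian blobs; returns (points, true_labels)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, (cx, cy) in enumerate([(-4.0, 0.0), (4.0, 0.0)]):
        for _ in range(n_per_class):
            points.append((rng.gauss(cx, 0.7), rng.gauss(cy, 0.7)))
            labels.append(label)
    return points, labels


def symmetric_knn_graph(points, k):
    """Undirected k-nearest-neighbor graph as a list of neighbor sets."""
    neighbors = [set() for _ in points]
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            ((xi - xj) ** 2 + (yi - yj) ** 2, j)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        for _, j in dists[:k]:
            # Add the edge in both directions so that a labeled outlier
            # still influences the points it is attached to.
            neighbors[i].add(j)
            neighbors[j].add(i)
    return neighbors


def propagate_labels(points, known, k=7, n_iter=200):
    """known maps point index -> 0/1 label; returns a 0/1 label per point."""
    neighbors = symmetric_knn_graph(points, k)
    # Score convention: -1 means class 0, +1 means class 1, 0 is undecided.
    score = [0.0] * len(points)
    for i, y in known.items():
        score[i] = 1.0 if y == 1 else -1.0
    for _ in range(n_iter):
        # Labeled points stay clamped; every other point takes the mean
        # score of its graph neighbors (smoothness assumption).
        score = [
            score[i] if i in known
            else sum(score[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(points))
        ]
    return [1 if s > 0 else 0 for s in score]


def run_demo(n_per_class=100, labeled_per_class=2, seed=0):
    """Keep 2 labels per class (2% of 200 points) and return the accuracy."""
    points, truth = make_two_blobs(n_per_class, seed)
    rng = random.Random(seed + 1)
    known = {}
    for label in (0, 1):
        candidates = [i for i, y in enumerate(truth) if y == label]
        for i in rng.sample(candidates, labeled_per_class):
            known[i] = label
    predicted = propagate_labels(points, known)
    correct = sum(p == t for p, t in zip(predicted, truth))
    return correct / len(truth)


if __name__ == "__main__":
    print(f"accuracy with 2% labeled points: {run_demo():.3f}")
```

On data this cleanly clustered, a handful of labels is enough because each cluster forms its own connected region of the graph. Real datasets with overlapping classes are where a single assumption can mislead, which is the situation the abstract's combination of several assumptions is meant to address.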