Improving Semi-Supervised Support Vector Machines Through Unlabeled Instances Selection.

Yu-Feng Li, Zhi-Hua Zhou
DOI: https://doi.org/10.1609/aaai.v25i1.7920
2011-01-01
Abstract: Semi-supervised support vector machines (S3VMs) are a kind of popular approaches which try to improve learning performance by exploiting unlabeled data. Though S3VMs have been found helpful in many situations, they may degenerate performance and the resultant generalization ability may be even worse than using the labeled data only. In this paper, we try to reduce the chance of performance degeneration of S3VMs. Our basic idea is that, rather than exploiting all unlabeled data, the unlabeled instances should be selected such that only the ones which are very likely to be helpful are exploited, while some highly risky unlabeled instances are avoided. We propose the S3VM-us method by using hierarchical clustering to select the unlabeled instances. Experiments on a broad range of data sets over eighty-eight different settings show that the chance of performance degeneration of S3VM-us is much smaller than that of existing S3VMs.