Gaussian Process Versus Margin Sampling Active Learning

Jin Zhou, Shiliang Sun
DOI: https://doi.org/10.1016/j.neucom.2015.04.086
IF: 6
2015-01-01
Neurocomputing
Abstract: There are a large number of unlabeled examples in real-world applications, and labeling them manually is very costly. How to label these massive unlabeled instances at minimal cost is receiving more and more attention. Active learning addresses this bottleneck by selecting the most informative examples from the unlabeled data and building a classifier with higher accuracy to label the remaining examples, which greatly improves efficiency. In this paper, we compare two traditional active learning algorithms that rely on a single classifier, namely Gaussian process and margin sampling active learning, in terms of classification error rate and computing time. Moreover, we compare their improved versions (GPMAL and IMS), which apply the manifold-preserving graph reduction (MPGR) algorithm. MPGR constructs a subset that exploits the structural spatial connectivity and spatial diversity among examples. By using MPGR, an active learner selects informative and representative candidates from the subset instead of from the whole unlabeled data set. In addition, a comparison with a state-of-the-art active learning method, QUIRE, is provided. Experimental results on multiple data sets show that both GPMAL and IMS have their own advantages.
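The abstract does not spell out the selection rule, so the following is only a minimal sketch of one common form of margin sampling: for each candidate, take the gap between the two largest predicted class probabilities and query the candidate with the smallest gap. The function names, the toy probabilities, and the candidate list (standing in for a reduced subset such as the one MPGR would produce) are all assumptions for illustration, not the authors' implementation.

```python
def margin_scores(probs):
    """Gap between the two largest class probabilities for each example."""
    scores = []
    for row in probs:
        top = sorted(row, reverse=True)
        scores.append(top[0] - top[1])
    return scores


def select_queries(probs, candidate_idx, k=1):
    """Return the k candidate indices with the smallest margins.

    candidate_idx is a hypothetical reduced subset of the unlabeled pool;
    passing every index queries over the whole pool instead.
    """
    scores = margin_scores([probs[i] for i in candidate_idx])
    ranked = sorted(zip(scores, candidate_idx))
    return [idx for _, idx in ranked[:k]]


# Toy predicted class probabilities for four unlabeled examples (made up).
probs = [
    [0.90, 0.10],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

# Query only within an assumed reduced subset {0, 1, 2}.
picked = select_queries(probs, [0, 1, 2], k=1)
print(picked)
```

Restricting `candidate_idx` to a subset is what makes the query cheaper than scanning the full pool; how that subset is built (here, MPGR in the paper) is outside this sketch.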