Combining Link and Content for Collective Active Learning

Lixin Shi, Yuhang Zhao, Jie Tang
DOI: https://doi.org/10.1145/1871437.1871740
2010-01-01
Abstract: In this paper, we study a novel problem, Collective Active Learning, in which we aim to select a batch of "informative" instances from a networked data set to query the user, in order to improve the accuracy of the learned classification model. We perform a theoretical investigation of the problem and present three criteria (i.e., minimum redundancy, maximum uncertainty, and maximum impact) to quantify the informativeness of a set of selected instances. We define an objective function based on the three criteria and present an efficient algorithm that optimizes this objective function with a bounded approximation rate. Experimental results on real-world data sets demonstrate the effectiveness of our proposed approach.