Combining Topological Analysis Matrices-Based Active Learning on Networked Data Classification

He Xiaoqi, Yangguang Liu, Xiaogang Jin
DOI: https://doi.org/10.1117/12.888404
2010-01-01
Abstract: Active learning is an important technique for improving a learned model with unlabeled data when labeled data is difficult to obtain but unlabeled data is plentiful and easy to collect. Several instance-querying strategies have been proposed recently. These works show that empirical risk minimization (ERM) can effectively find the next instance to label, but at a high computational cost. This paper introduces a new approach that selects the best instance in less time. When the data is graph-structured, graph topological analysis can rapidly select instances that are likely to be good candidates for labeling. The paper describes an approach that uses the degree of a node as the metric for identifying the best instance to label next. Experiments on Zachary's Karate Club dataset and the 20 Newsgroups dataset, with four binary classification tasks, show that the node-degree strategy performs similarly to ERM while taking less time.
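The node-degree query strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_adjacency` and `next_query_by_degree`, the toy edge list, and the tie-breaking rule (smallest node id wins) are all assumptions made for illustration, and the abstract does not say how ties or already-labeled nodes are handled.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def next_query_by_degree(adj, labeled):
    """Return the unlabeled node with the highest degree, or None.

    Assumption (not from the paper): ties are broken by choosing the
    smallest node id, which works because max() returns the first
    maximal element of the sorted candidate list.
    """
    candidates = sorted(n for n in adj if n not in labeled)
    if not candidates:
        return None
    return max(candidates, key=lambda n: len(adj[n]))


# Hypothetical toy graph, not one of the paper's datasets.
toy_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4), (4, 5), (4, 6)]
adj = build_adjacency(toy_edges)

labeled = set()
query_order = []
for _ in range(3):  # simulate three labeling rounds
    node = next_query_by_degree(adj, labeled)
    query_order.append(node)
    labeled.add(node)  # in practice an oracle would supply the label here
```

Each query only requires reading node degrees, with no model retraining per candidate, which is consistent with the abstract's claim that the degree strategy takes less time than ERM-based selection.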