Active Learning with Multi-granular Graph Auto-Encoder

Yi He, Xu Yuan, Nian-Feng Tzeng, Xindong Wu
DOI: https://doi.org/10.1109/icdm50108.2020.00125
2020-01-01
Abstract: Predictive modeling of networked data finds many real-world applications, such as fraud detection in social networks, drug discovery in biomedical networks, and paper topic classification in citation networks. Although advanced machine learning approaches can help build reasonably accurate predictive models, their applicability is severely hindered by data labeling tasks, which are onerous, time-consuming, and error-prone. In this paper, we propose a novel active learning paradigm for networked data, named topology-and-content-aware (TACA) active learning, which aims to minimize the number of labels required while achieving a desirable level of model accuracy. TACA advances existing work in two respects: (1) TACA makes no assumption about the network property, whereas most existing methods perform effectively only on a locally consistent network, in which linked nodes are expected to share the same labels; and (2) TACA generates queries without relying on model performance, and therefore yields robust predictive results even when the queried labels are noisy. Both theoretical and empirical evidence is presented, substantiating the effectiveness of, and optimism about, our approach.