Effective Semi-Supervised Node Classification on Few-Labeled Graph Data
Ziang Zhou, Jieming Shi, Shengzhong Zhang, Zengfeng Huang, Qing Li
2019-01-01
Abstract: Graph neural networks (GNNs) are designed for semi-supervised node classification on graphs where only a small subset of nodes have class labels. However, in extreme cases where very few labels are available (e.g., 1 labeled node per class), the accuracy of GNNs degrades severely. Several existing studies make initial efforts to address this problem, but their results are still far from satisfactory. In this paper, we propose ABN, an effective framework for few-labeled graph data that is readily applicable to both shallow and deep GNN architectures and significantly boosts classification accuracy. In particular, on the benchmark dataset Cora with only 1 labeled node per class, the classic graph convolutional network (GCN) achieves only 44.6% accuracy, whereas an immediate instantiation of ABN over GCN achieves 62.5%; when applied to the deep architecture DAGNN, ABN improves accuracy from 59.8% to 66.4%, which is state of the art. ABN obtains superior performance through three main algorithmic designs. First, it selects high-quality unlabeled nodes via an adaptive pseudo labeling technique, so as to adaptively enhance the training process of GNNs. Second, ABN balances the labels of the selected nodes on real-world skewed graph data by pseudo label balancing. Finally, a negative sampling regularizer is designed for ABN to further utilize the unlabeled nodes. The effectiveness of the three techniques in ABN is validated by both theoretical and empirical analysis. Extensive experiments, comparing against 12 existing approaches on 4 benchmark datasets, demonstrate that ABN achieves state-of-the-art performance.

ACM Reference Format: Ziang Zhou, Jieming Shi, Shengzhong Zhang, Zengfeng Huang, and Qing Li. 2021. Effective Semi-Supervised Node Classification on Few-Labeled Graph Data. In Proceedings of SIGKDD ’21. XXXX, 12 pages.
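The first two designs named in the abstract, pseudo labeling of confident unlabeled nodes and balancing of the resulting pseudo labels, can be illustrated with a minimal sketch. The abstract does not specify how ABN adapts its selection or how it balances classes, so everything below is an assumption made for illustration: the function name `select_balanced_pseudo_labels`, the fixed `threshold` (ABN's selection is described as adaptive, which this sketch does not reproduce), and the rule of capping every class at the size of the smallest selected class.

```python
import numpy as np

def select_balanced_pseudo_labels(probs, unlabeled_idx, threshold=0.7):
    """Pick confident unlabeled nodes, then cap every class at the same count.

    probs: (num_nodes, num_classes) array of predicted class probabilities.
    unlabeled_idx: indices of nodes without a ground-truth label.
    threshold: hypothetical fixed confidence cutoff (an assumption, not ABN's rule).
    Returns a dict mapping node index -> pseudo label.
    """
    unlabeled_idx = np.asarray(unlabeled_idx)
    conf = probs[unlabeled_idx].max(axis=1)      # confidence of the top class
    pred = probs[unlabeled_idx].argmax(axis=1)   # predicted class per node

    # Step 1: keep only predictions at or above the confidence cutoff.
    keep = conf >= threshold
    idx, conf, pred = unlabeled_idx[keep], conf[keep], pred[keep]
    if idx.size == 0:
        return {}

    # Step 2 (assumed balancing rule): cap each class at the size of the
    # smallest selected class, keeping the most confident nodes first.
    classes, counts = np.unique(pred, return_counts=True)
    cap = counts.min()
    selected = {}
    for c in classes:
        members = np.where(pred == c)[0]
        top = members[np.argsort(-conf[members])[:cap]]
        for j in top:
            selected[int(idx[j])] = int(c)
    return selected

# Toy example: class 0 dominates the confident predictions.
probs = np.array([
    [0.90, 0.10],
    [0.80, 0.20],
    [0.95, 0.05],
    [0.20, 0.80],
    [0.55, 0.45],
])
result = select_balanced_pseudo_labels(probs, [0, 1, 2, 3, 4], threshold=0.7)
print(result)  # {2: 0, 3: 1}
```

Without the balancing step, three of the four selected nodes would carry the pseudo label 0, so a model retrained on them would be pushed toward the majority class. This is the kind of skew that the abstract says pseudo label balancing is meant to counter.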