Toward Effective Semi-supervised Node Classification with Hybrid Curriculum Pseudo-labeling
Xiao Luo, Wei Ju, Yiyang Gu, Yifang Qin, Siyu Yi, Daqing Wu, Luchen Liu, Ming Zhang
DOI: https://doi.org/10.1145/3626528
IF: 4.094
2024-01-01
ACM Transactions on Multimedia Computing, Communications, and Applications
Abstract: Semi-supervised node classification is a crucial challenge in relational data mining and has attracted increasing interest in research on graph neural networks (GNNs). However, previous approaches use only labeled nodes to supervise the overall optimization and fail to sufficiently explore the information in the underlying label distribution. Even worse, they often overlook model robustness, which can make network outputs unstable under random perturbations. To address these shortcomings, we develop a novel framework termed Hybrid Curriculum Pseudo-Labeling (HCPL) for efficient semi-supervised node classification. Technically, HCPL iteratively annotates unlabeled nodes: in each round, it trains a GNN model on the labeled samples together with any previously pseudo-labeled samples, then uses that model to pseudo-label further nodes. To improve model robustness, we introduce a hybrid pseudo-labeling strategy that incorporates both prediction confidence and uncertainty under random perturbations, thereby mitigating the influence of erroneous pseudo-labels. Finally, we leverage the idea of curriculum learning to start by annotating easy samples and gradually explore hard samples as the iterations proceed. Extensive experiments on a number of benchmarks demonstrate that HCPL outperforms various state-of-the-art baselines in diverse settings.
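The abstract does not say how confidence and uncertainty are combined, nor how the curriculum is scheduled, so the following is only a minimal sketch of the selection step under stated assumptions. It assumes that confidence is the maximum of the class probabilities averaged over several perturbed forward passes, that uncertainty is the standard deviation of the predicted-class probability across those passes, that a node is pseudo-labeled only when it passes both thresholds, and that the thresholds are relaxed linearly over rounds. The function names, parameters, and default values are hypothetical and are not taken from the paper.

```python
import numpy as np


def curriculum_thresholds(step, num_steps, conf_start=0.95, conf_end=0.70,
                          unc_start=0.02, unc_end=0.10):
    """Linearly relax both thresholds from strict (easy nodes only) to loose.

    Assumption: a linear schedule; the abstract only says that easy samples
    are annotated first and hard samples later.
    """
    frac = step / max(num_steps - 1, 1)
    conf_thr = conf_start + frac * (conf_end - conf_start)
    unc_thr = unc_start + frac * (unc_end - unc_start)
    return conf_thr, unc_thr


def select_pseudo_labels(probs, unlabeled_mask, conf_thr, unc_thr):
    """Pick unlabeled nodes that are both confident and stable.

    probs: array of shape (T, N, C) holding class probabilities from T
           forward passes under random perturbations (e.g. dropout).
    unlabeled_mask: boolean array of shape (N,), True for unlabeled nodes.
    Returns (selected, pseudo): a boolean mask of shape (N,) and the
    predicted class index for every node.
    """
    mean_probs = probs.mean(axis=0)            # (N, C)
    pseudo = mean_probs.argmax(axis=1)         # (N,)
    confidence = mean_probs.max(axis=1)        # (N,)

    # Assumed uncertainty measure: spread of the predicted-class
    # probability across the T perturbed passes.
    node_idx = np.arange(probs.shape[1])
    uncertainty = probs[:, node_idx, pseudo].std(axis=0)   # (N,)

    selected = (
        unlabeled_mask
        & (confidence >= conf_thr)
        & (uncertainty <= unc_thr)
    )
    return selected, pseudo
```

In a full loop, each round would train the GNN on the labeled and previously pseudo-labeled nodes, collect `probs` from several perturbed passes, call `curriculum_thresholds` for the current round, and add the nodes returned by `select_pseudo_labels` to the training set. Requiring both conditions means a node with a high average probability is still rejected if its prediction swings between passes.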