IGCN: A Provably Informative GCN Embedding for Semi-Supervised Learning With Extremely Limited Labels

Lin Zhang, Ran Song, Wenhao Tan, Lin Ma, Wei Zhang
DOI: https://doi.org/10.1109/TPAMI.2024.3404655
Abstract: Graph Neural Networks (GNNs) have attracted growing attention for representation learning on graph-structured data. However, labels in a graph are typically limited, which easily leads to overfitting and poor performance. To address this problem, we propose a new framework called IGCN, short for Informative Graph Convolutional Network, whose objective is to obtain informative embeddings by discarding the task-irrelevant information in the graph data, as measured by mutual information. Because mutual information is intractable to compute for irregular data, the framework is optimized via a surrogate objective in which two terms are derived to approximate the original objective. The first term requires that the mutual information between the learned embeddings and the ground truth be high; we optimize it with a semi-supervised classification loss and a prototype-based supervised contrastive learning loss. The second term requires that the mutual information between the learned node embeddings and the initial embeddings be high; we maximize it by minimizing the reconstruction loss between them at the feature level and the layer level, which involves a graph encoder-decoder module and a novel architecture, GCN Info. Moreover, we provably show that the designed GCN Info better alleviates information loss and preserves as much useful information from the initial embeddings as possible. Experimental results show that IGCN outperforms state-of-the-art methods on 7 popular datasets.
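The following is a minimal sketch of how the three losses named in the abstract might be combined into one surrogate objective. It is not the authors' implementation. The abstract does not give the exact loss forms, so everything here is an assumption made for illustration: the cosine-similarity prototype loss with a temperature, the mean-squared-error reconstruction loss, the weights `alpha` and `beta`, and all function names.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability of the true class.
    return -math.log(softmax(logits)[label])

def class_prototypes(embeddings, labels, num_classes):
    # Prototype of a class = mean embedding of its labeled nodes (assumed form).
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for z, y in zip(embeddings, labels):
        counts[y] += 1
        for d in range(dim):
            sums[y][d] += z[d]
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb + 1e-12)

def prototype_contrastive_loss(embeddings, labels, prototypes, temperature=0.5):
    # Pull each labeled embedding toward its own class prototype and away
    # from the other prototypes (an assumed, InfoNCE-style formulation).
    total = 0.0
    for z, y in zip(embeddings, labels):
        sims = [cosine(z, p) / temperature for p in prototypes]
        total += cross_entropy(sims, y)
    return total / len(embeddings)

def reconstruction_loss(original, reconstructed):
    # Mean squared error between initial and reconstructed embeddings.
    n = sum(len(row) for row in original)
    sq = sum((a - b) ** 2
             for ro, rr in zip(original, reconstructed)
             for a, b in zip(ro, rr))
    return sq / n

def surrogate_objective(logits, embeddings, labels, initial, reconstructed,
                        num_classes, alpha=1.0, beta=1.0):
    # First term (embeddings vs. ground truth): classification + prototype loss.
    # Second term (embeddings vs. initial embeddings): reconstruction loss.
    # alpha and beta are hypothetical weights, not values from the paper.
    cls = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    protos = class_prototypes(embeddings, labels, num_classes)
    proto = prototype_contrastive_loss(embeddings, labels, protos)
    rec = reconstruction_loss(initial, reconstructed)
    return cls + alpha * proto + beta * rec
```

In this sketch only the labeled nodes enter the first two losses, while the reconstruction loss can be computed over all nodes, which is one plausible reading of why the second term helps when labels are extremely limited.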