Multiview Subgraph Neural Networks: Self-Supervised Learning With Scarce Labeled Data

Zhenzhong Wang, Qingyuan Zeng, Wanyu Lin, Min Jiang, Kay Chen Tan
DOI: https://doi.org/10.1109/TNNLS.2024.3443074
2024-10-22
Abstract: While graph neural networks (GNNs) have become the de facto standard for graph-based node classification, they impose a strong assumption on the availability of sufficient labeled samples. This assumption restricts the classification performance of prevailing GNNs in many real-world applications that operate in low-data regimes. Specifically, features extracted from scarce labeled nodes cannot provide sufficient supervision for the unlabeled samples, leading to severe overfitting. We point out that leveraging subgraphs to capture long-range dependencies can augment the node representation, thus alleviating the effects of the low-data regime. To this end, we present a novel self-supervised learning (SSL) framework, called multiview subgraph neural networks (Muse), for handling the long-range dependencies. In particular, we propose an information theory-based identification mechanism to identify two types of subgraphs from the views of input space and latent space, respectively. The former captures the local structure of the graph, while the latter captures the long-range dependencies among nodes. By fusing these two views of subgraphs, the learned representations can preserve the topological properties of the graph at large, including the local structure and long-range dependencies, thus maximizing their expressiveness. Theoretically, we provide a generalization error bound to show the effectiveness of capturing complementary information from multiview subgraphs. Empirically, we show a proof-of-concept of Muse on canonical node classification problems on graph data.
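The two-view idea in the abstract can be illustrated with a toy sketch. This is not the Muse method: the information theory-based identification mechanism is replaced here by two simple stand-ins (a k-hop breadth-first search for the input-space view and a nearest-neighbor search over embeddings for the latent-space view), the embeddings are set by hand rather than learned, and every function and variable name is hypothetical.

```python
from collections import deque
import math

# Toy path graph 0-1-2-3-4-5 (assumed example data, not from the paper).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
# Hand-set 2-D embeddings; node 5 is placed close to node 0 in latent space
# even though it is five hops away in the graph.
emb = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0], [0.1, 0.0]]


def k_hop_neighbors(adj, node, k):
    """Input-space view: nodes within k hops of `node`, excluding `node`."""
    depth = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        if depth[u] == k:
            continue  # do not expand beyond k hops
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    return sorted(v for v in depth if v != node)


def latent_neighbors(emb, node, m):
    """Latent-space view: the m nodes nearest to `node` by Euclidean distance."""
    dists = [(math.dist(emb[node], emb[v]), v)
             for v in range(len(emb)) if v != node]
    return sorted(v for _, v in sorted(dists)[:m])


def mean_vec(emb, nodes):
    """Mean embedding of `nodes`; a zero vector if `nodes` is empty."""
    dim = len(emb[0])
    if not nodes:
        return [0.0] * dim
    return [sum(emb[v][d] for v in nodes) / len(nodes) for d in range(dim)]


def fuse_views(emb, adj, node, k=1, m=1):
    """Concatenate the node's embedding with the mean of each view."""
    local = k_hop_neighbors(adj, node, k)    # local structure
    distant = latent_neighbors(emb, node, m)  # possible long-range partner
    return emb[node] + mean_vec(emb, local) + mean_vec(emb, distant)


print(k_hop_neighbors(adj, 0, 1))   # graph neighbors of node 0
print(latent_neighbors(emb, 0, 1))  # latent neighbor of node 0
print(fuse_views(emb, adj, 0))
```

In this toy setup the input-space view of node 0 contains only its graph neighbor, while the latent-space view selects node 5, which the graph structure alone would reach only after five hops. Concatenation is one simple fusion choice among many; the abstract does not specify how the two views are fused.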