VIGraph: Generative Self-supervised Learning for Class-Imbalanced Node Classification

Yulan Hu, Sheng Ouyang, Zhirui Yang, Yong Liu
2024-03-27
Abstract: Class imbalance in graph data presents significant challenges for node classification. Existing methods, such as SMOTE-based approaches, partially mitigate this issue but remain limited in how they construct imbalanced graphs. Generative self-supervised learning (SSL) methods, exemplified by graph autoencoders (GAEs), offer a promising alternative by generating minority nodes directly from the data itself, yet their potential remains underexplored. In this paper, we examine the shortcomings of SMOTE-based approaches in constructing imbalanced graphs and introduce VIGraph, a simple yet effective generative SSL approach built on the Variational GAE. VIGraph strictly adheres to the concept of imbalance when constructing imbalanced graphs and leverages the variational inference (VI) ability of the Variational GAE to generate nodes for minority classes. Its training strategies include cross-view contrastive learning at the decoding phase to capture semantic knowledge, adjacency matrix reconstruction to preserve graph structure, and an alignment strategy to ensure stable training. The generated nodes are of high quality and directly usable for classification, which eliminates both the need to integrate them back into the graph and the additional retraining required by SMOTE-based methods. Extensive experiments demonstrate the superiority and generality of our approach.
Machine Learning, Artificial Intelligence
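
The abstract says VIGraph uses the variational inference ability of the Variational GAE to generate minority-class nodes, but gives no equations. The sketch below is therefore only an illustration built on the standard reparameterization trick, z = mu + sigma * eps, not the authors' implementation. The function names `nodes_needed` and `sample_minority_nodes`, the choice to top every class up to the size of the largest one, and the example values are all assumptions made for illustration.

```python
import math
import random


def nodes_needed(class_counts):
    """Assumed balancing rule: top every class up to the largest class size."""
    target = max(class_counts.values())
    return {label: target - count for label, count in class_counts.items()}


def sample_minority_nodes(mu, log_var, num_samples, seed=0):
    """Draw latent vectors around one encoded node with z = mu + sigma * eps.

    mu and log_var stand in for the mean and log-variance that a variational
    encoder would output for a minority-class node (hypothetical inputs here).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        z = [
            m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)  # sigma = exp(log_var / 2)
            for m, lv in zip(mu, log_var)
        ]
        samples.append(z)
    return samples


# Toy usage: class 1 is the minority and needs 45 extra nodes.
deficit = nodes_needed({0: 50, 1: 5})
generated = sample_minority_nodes(mu=[0.2, -1.0], log_var=[-2.0, -2.0],
                                  num_samples=deficit[1])
```

In this sketch the sampled vectors would be passed straight to a classifier as extra minority examples, which mirrors the abstract's point that generated nodes need not be inserted back into the graph.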