GraphPatcher: Mitigating Degree Bias for Graph Neural Networks via Test-time Augmentation

Mingxuan Ju, Tong Zhao, Wenhao Yu, Neil Shah, Yanfang Ye
2023-10-02
Abstract: Recent studies have shown that graph neural networks (GNNs) exhibit strong biases towards the node degree: they usually perform satisfactorily on high-degree nodes with rich neighbor information but struggle with low-degree nodes. Existing works tackle this problem by deriving either designated GNN architectures or training strategies specifically for low-degree nodes. Though effective, these approaches unintentionally create an artificial out-of-distribution scenario, where models mainly or even only observe low-degree nodes during the training, leading to a downgraded performance for high-degree nodes that GNNs originally perform well at. In light of this, we propose a test-time augmentation framework, namely GraphPatcher, to enhance test-time generalization of any GNNs on low-degree nodes. Specifically, GraphPatcher iteratively generates virtual nodes to patch artificially created low-degree nodes via corruptions, aiming at progressively reconstructing target GNN's predictions over a sequence of increasingly corrupted nodes. Through this scheme, GraphPatcher not only learns how to enhance low-degree nodes (when the neighborhoods are heavily corrupted) but also preserves the original superior performance of GNNs on high-degree nodes (when lightly corrupted). Additionally, GraphPatcher is model-agnostic and can also mitigate the degree bias for either self-supervised or supervised GNNs. Comprehensive experiments are conducted over seven benchmark datasets and GraphPatcher consistently enhances common GNNs' overall performance by up to 3.6% and low-degree performance by up to 6.5%, significantly outperforming state-of-the-art baselines. The source code is publicly available at https://github.com/jumxglhf/GraphPatcher.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses the degree bias of Graph Neural Networks (GNNs). Existing GNNs typically perform well on high-degree nodes, which have rich neighbor information, but perform markedly worse on low-degree nodes, which have little neighbor information to aggregate.

### Background and Motivation

1. **Limitations of Existing Research**:
   - Existing methods improve performance on low-degree nodes by designing specific GNN architectures or training strategies. These methods inadvertently create an artificial out-of-distribution scenario in which the model mainly or only observes low-degree nodes during training, which degrades performance on high-degree nodes.
   - These methods usually require changes to the model architecture or to training. That can be impractical in real-world applications, because retraining on large graphs is costly and the same model is often used for many functions in production environments.
2. **Contributions of the Paper**:
   - Proposes a test-time augmentation framework called GraphPatcher, which mitigates degree bias without changing the model architecture.
   - GraphPatcher improves performance on low-degree nodes while preserving the original advantage of GNNs on high-degree nodes.
   - The framework is model-agnostic and can be applied to both self-supervised and supervised GNNs.

### Method Overview

1. **Basic Idea**:
   - GraphPatcher patches the neighborhoods of low-degree nodes by generating virtual nodes, so that the target GNN's prediction on the patched neighborhood approaches its prediction on the original, uncorrupted neighborhood.
   - The virtual nodes are generated iteratively, repairing the corrupted neighborhood step by step. This lets the GNN perform better on low-degree nodes while maintaining its performance on high-degree nodes.
2. **Technical Details**:
   - **Node Repair**: Given a trained GNN and a graph, GraphPatcher repairs a corrupted ego-graph (the subgraph centered on a node) by generating virtual nodes.
   - **Optimization Process**: The repair process is optimized by minimizing the Kullback-Leibler (KL) divergence between the GNN's predictions on the repaired graph and its predictions on the original graph, so that the two stay close.
   - **Multi-step Iteration**: GraphPatcher generates virtual nodes over multiple iterations, gradually repairing the corrupted ego-graph, and thereby learns how to repair nodes under different degrees of corruption.

### Experimental Results

1. **Datasets**:
   - Comprehensive experiments were conducted on the benchmark datasets Cora, Citeseer, Pubmed, Wiki.CS, Amazon-Photo, Coauthor-CS, ogbn-arxiv, Actor, and Chameleon (the abstract reports seven benchmark datasets).
2. **Baseline Methods**:
   - Compared with six state-of-the-art graph learning frameworks: frameworks designed specifically for low-degree nodes (TAIL-GNN, COLDBREW, TUNEUP), frameworks for handling out-of-distribution scenarios (EERM, GTRANS), and a training-time data augmentation framework (DROPEDGE).
3. **Performance Comparison**:
   - GraphPatcher consistently improved performance on low-degree nodes across all datasets, with an average improvement of 2.23 percentage points.
   - At the same time, GraphPatcher maintained the original advantage of GNNs on high-degree nodes, with an average overall performance improvement of 1.4 percentage points.

### Conclusion

The paper proposes an effective test-time augmentation framework, GraphPatcher, which mitigates the performance bias of GNNs on low-degree nodes without changing the model architecture, while maintaining performance on high-degree nodes. The method offers a new approach for improving Graph Neural Networks in real-world applications.
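
To make the method overview more concrete, here is a minimal Python sketch of two ingredients it describes: building a sequence of increasingly corrupted neighborhoods for one ego-graph, and scoring each corrupted prediction against the prediction on the full neighborhood with KL divergence. It is not the authors' implementation. The function names, the `toy_predict` stand-in for a trained GNN, the toy features, and the KL direction (target first) are all assumptions made for illustration. The sketch does not generate virtual nodes; it only measures the prediction gap that a patching model would be trained to close.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_predict(center, neighbors, features):
    """Hypothetical stand-in for a trained GNN: average the feature vectors
    of the center node and its neighbors, then apply softmax."""
    nodes = [center] + list(neighbors)
    dim = len(features[center])
    mean = [sum(features[n][d] for n in nodes) / len(nodes) for d in range(dim)]
    return softmax(mean)


def corruption_sequence(neighbors, keep_counts, rng):
    """Return nested neighbor subsets, from lightly to heavily corrupted.

    keep_counts must be non-increasing; each subset is sampled from the
    previous one, so later steps only ever drop more neighbors."""
    current = list(neighbors)
    sequence = []
    for k in keep_counts:
        current = sorted(rng.sample(current, k))
        sequence.append(current)
    return sequence


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


# Toy ego-graph (assumed data): node 0 with five neighbors, 3-class features.
features = {
    0: [2.0, 0.0, 0.0],
    1: [1.5, 0.5, 0.0],
    2: [1.0, 0.0, 1.0],
    3: [0.0, 2.0, 0.0],
    4: [2.5, 0.0, 0.5],
    5: [0.0, 0.0, 2.0],
}
rng = random.Random(0)

# Target: the stand-in model's prediction on the uncorrupted neighborhood.
target = toy_predict(0, [1, 2, 3, 4, 5], features)

# Keep 4, then 2, then 0 neighbors: light, medium, and heavy corruption.
steps = corruption_sequence([1, 2, 3, 4, 5], keep_counts=[4, 2, 0], rng=rng)

# Gap between the target and each corrupted prediction.
losses = [kl_divergence(target, toy_predict(0, kept, features)) for kept in steps]
for kept, loss in zip(steps, losses):
    print(f"kept={kept} KL={loss:.4f}")
```

In a full pipeline, the corrupted neighborhood would first be patched with generated virtual nodes, and the KL term would be computed on the patched prediction and minimized with respect to the patching model's parameters. Sampling each step from the previous one keeps the subsets nested, which matches the idea of a progressively corrupted sequence.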