GCNH: A Simple Method For Representation Learning On Heterophilous Graphs

Andrea Cavallo,Claas Grohnfeldt,Michele Russo,Giulio Lovisotto,Luca Vassio
DOI: https://doi.org/10.1109/IJCNN54540.2023.10191196
2023-04-21
Abstract: Graph Neural Networks (GNNs) are well-suited for learning on homophilous graphs, i.e., graphs in which edges tend to connect nodes of the same type. Yet, achievement of consistent GNN performance on heterophilous graphs remains an open research problem. Recent works have proposed extensions to standard GNN architectures to improve performance on heterophilous graphs, trading off model simplicity for prediction accuracy. However, these models fail to capture basic graph properties, such as neighborhood label distribution, which are fundamental for learning. In this work, we propose GCN for Heterophily (GCNH), a simple yet effective GNN architecture applicable to both heterophilous and homophilous scenarios. GCNH learns and combines separate representations for a node and its neighbors, using one learned importance coefficient per layer to balance the contributions of center nodes and neighborhoods. We conduct extensive experiments on eight real-world graphs and a set of synthetic graphs with varying degrees of heterophily to demonstrate how the design choices for GCNH lead to a sizable improvement over a vanilla GCN. Moreover, GCNH outperforms state-of-the-art models of much higher complexity on four out of eight benchmarks, while producing comparable results on the remaining datasets. Finally, we discuss and analyze the lower complexity of GCNH, which results in fewer trainable parameters and faster training times than other methods, and show how GCNH mitigates the oversmoothing problem.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the poor performance of graph neural networks (GNNs) on heterophilous graphs, i.e., graphs in which neighboring nodes often have different labels or features. Standard GNN architectures aggregate information from neighbors, so when those neighbors differ from the central node, prediction accuracy declines.

### Specific problem description in the paper

1. **Limitations of existing GNNs**:
   - Standard GNN architectures (such as GCN) perform well on homophilous graphs but inconsistently on heterophilous graphs.
   - Neighboring nodes in heterophilous graphs may be quite different from the central node, so message passing introduces noise that degrades the final node representation.
2. **Deficiencies of existing improvement methods**:
   - Some studies improve the performance of GNNs on heterophilous graphs by modifying the graph structure or adjusting the message aggregation strategy, but these methods increase model complexity and still fail to capture basic graph properties (such as neighborhood label distribution).
3. **Research objectives**:
   - Propose a simple and effective GNN architecture that performs well on both heterophilous and homophilous graphs.
   - Learn separate representations for a node and its neighbors, and adjust their relative contributions so that the model adapts to different types of graphs.

### Overview of the solution

To address the above problems, the authors propose GCNH (GCN for Heterophily). Its main design features are:

- **Separate encoding**: Use two separate multi-layer perceptrons (MLPs) to encode the central node and its neighbors respectively, which avoids mixing the node's own information with that of dissimilar neighbors.
- **Learnable importance coefficient**: Introduce a learnable parameter β, one per layer, to balance the contributions of the central node and its neighborhood.

With these design choices, GCNH captures information about nodes and their neighbors on heterophilous graphs while keeping model complexity low and training fast.

### Experimental verification

The authors conducted experiments on eight real-world graphs and a set of synthetic graphs with varying degrees of heterophily. The results show a sizable improvement over a vanilla GCN. GCNH also outperforms state-of-the-art models of much higher complexity on four of the eight benchmarks and produces comparable results on the remaining datasets. In addition, GCNH has fewer trainable parameters and faster training times than other methods, and it mitigates the over-smoothing problem.

### Summary

This paper addresses the poor performance of GNNs on heterophilous graphs by proposing GCNH, a simple and effective method for consistent and efficient node representation learning on both heterophilous and homophilous graphs.
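As an illustration of the layer design described in the solution overview, the following is a minimal NumPy sketch of a single GCNH-style layer (forward pass only). It is not the authors' implementation. The text above states only that the center node and its neighbors are encoded separately and combined with a learned coefficient β; the one-layer ReLU encoders without bias, the sum and mean aggregators, the convex combination `(1 - beta) * self + beta * neighbors`, and the names `gcnh_layer`, `w_self`, and `w_neigh` are all assumptions made for this example. In a trained model β would be learned per layer; here it is passed in as a fixed number.

```python
import numpy as np


def gcnh_layer(h, adj, w_self, w_neigh, beta, aggr="sum"):
    """One GCNH-style layer, forward pass only (illustrative sketch).

    h:       (n, d_in) node feature matrix
    adj:     (n, n) binary adjacency matrix without self-loops
    w_self:  (d_in, d_out) weights of the center-node encoder (assumed one layer)
    w_neigh: (d_in, d_out) weights of the neighbor encoder (assumed one layer)
    beta:    scalar in [0, 1]; learned per layer in a real model, fixed here
    aggr:    "sum" or "mean" neighborhood aggregation (assumed choices)
    """
    # Separate encodings: the same input features pass through two
    # different weight matrices, one for each role a node can play.
    z_self = np.maximum(h @ w_self, 0.0)
    z_neigh = np.maximum(h @ w_neigh, 0.0)

    # Aggregate the neighbor encodings over each node's neighborhood.
    agg = adj @ z_neigh
    if aggr == "mean":
        deg = adj.sum(axis=1, keepdims=True)
        agg = agg / np.maximum(deg, 1.0)  # isolated nodes keep a zero vector
    elif aggr != "sum":
        raise ValueError(f"unknown aggregation: {aggr!r}")

    # beta balances the center node against its neighborhood.
    return (1.0 - beta) * z_self + beta * agg


# Usage on a random 4-node path graph with 3 input and 2 output features.
rng = np.random.default_rng(0)
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
h = rng.normal(size=(4, 3))
out = gcnh_layer(h, adj, rng.normal(size=(3, 2)), rng.normal(size=(3, 2)), beta=0.3)
print(out.shape)  # (4, 2)
```

In this sketch, setting `beta` to 0 reduces the layer to a plain per-node MLP that ignores the graph, while setting it to 1 uses only the aggregated neighborhood. This range is what lets a single coefficient adapt the layer to heterophilous or homophilous graphs.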