Deep Graph Attention Networks

Jun Kato, Airi Mita, Keita Gobara, Akihiro Inokuchi
2024-10-21
Abstract: Graphs are useful for representing various real-world objects. However, graph neural networks (GNNs) tend to suffer from over-smoothing, where the representations of nodes of different classes become similar as the number of layers increases, leading to performance degradation. A method that does not require protracted tuning of the number of layers is needed to effectively construct a graph attention network (GAT), a type of GNN. Therefore, we introduce a method called "DeepGAT" for predicting the class to which nodes belong in a deep GAT. It avoids over-smoothing in a GAT by ensuring that nodes in different classes are not similar at each layer. Using DeepGAT to predict class labels, a 15-layer network is constructed without the need to tune the number of layers. DeepGAT prevented over-smoothing and achieved a 15-layer GAT with similar performance to a 2-layer GAT, as indicated by the similar attention coefficients. DeepGAT enables the training of a large network to acquire similar attention coefficients to a network with few layers. It avoids the over-smoothing problem and obviates the need to tune the number of layers, thus saving time and enhancing GNN performance.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the over-smoothing problem in graph neural networks (GNNs): as the number of layers increases, the representations of nodes from different classes become similar, and performance declines. To overcome this, the paper proposes "DeepGAT", a method for predicting the classes of nodes in deep graph attention networks (GATs). By ensuring that nodes from different classes are not similar at each layer, DeepGAT avoids over-smoothing and can construct a 15-layer GAT without tuning the number of layers, which saves time and improves GNN performance.

The main contributions of the paper include:

1. **Achieving a 15-layer GAT**: DeepGAT can construct a 15-layer GAT without tuning the number of layers, with performance comparable to a 2-layer GAT.
2. **Mathematical Proof**: Based on the regenerative property of probability distributions, the paper mathematically explains why DeepGAT can avoid over-smoothing and shows that the distributions of representations of nodes from different classes overlap less.

Through these methods, DeepGAT enables the construction of deeper GATs while maintaining high classification performance across multiple datasets, providing new insights and methods for GNN research.
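
Over-smoothing, and the effect of keeping nodes from different classes dissimilar at each layer, can be illustrated with a small toy sketch. The code below is not the paper's method: the graph, the features, and the helper names (`neighbors`, `propagate`, `class_gap`, `cross_class_weight`) are all hypothetical, and the fixed `cross_class_weight` is only a stand-in for an attention coefficient. It also uses the true labels directly, whereas DeepGAT predicts class labels. The sketch compares plain mean aggregation with aggregation that down-weights edges between classes, and reports how far apart the two class means remain after 2 and 15 layers.

```python
# Toy graph (an assumption for illustration): nodes 0-2 are class 0,
# nodes 3-5 are class 1, with a single cross-class edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
LABELS = [0, 0, 0, 1, 1, 1]
FEATURES = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # one scalar feature per node


def neighbors(n_nodes, edges):
    """Build adjacency lists that include a self-loop for every node."""
    adj = [[i] for i in range(n_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def propagate(features, adj, labels, cross_class_weight, n_layers):
    """Apply n_layers of weighted neighbor averaging (no learned parameters).

    cross_class_weight is a hand-set stand-in for an attention coefficient:
    1.0 gives plain mean aggregation, a value near 0 suppresses messages
    that cross the class boundary.
    """
    h = list(features)
    for _ in range(n_layers):
        new_h = []
        for i, nbrs in enumerate(adj):
            weights = [
                1.0 if labels[j] == labels[i] else cross_class_weight
                for j in nbrs
            ]
            total = sum(weights)
            new_h.append(sum(w * h[j] for w, j in zip(weights, nbrs)) / total)
        h = new_h
    return h


def class_gap(h, labels):
    """Absolute difference between the mean representations of the two classes."""
    mean0 = sum(x for x, y in zip(h, labels) if y == 0) / labels.count(0)
    mean1 = sum(x for x, y in zip(h, labels) if y == 1) / labels.count(1)
    return abs(mean0 - mean1)


ADJ = neighbors(len(LABELS), EDGES)
for depth in (2, 15):
    uniform = class_gap(propagate(FEATURES, ADJ, LABELS, 1.0, depth), LABELS)
    gated = class_gap(propagate(FEATURES, ADJ, LABELS, 0.01, depth), LABELS)
    print(f"layers={depth:2d}  uniform gap={uniform:.3f}  gated gap={gated:.3f}")
```

With uniform weights the gap between the class means shrinks toward zero as depth grows, which is the over-smoothing behavior described above. With the cross-class weight close to zero, most of the gap survives at 15 layers. A real GAT would learn its attention coefficients rather than take them from a fixed constant.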