Towards Deep Attention in Graph Neural Networks: Problems and Remedies

Soo Yong Lee, Fanchen Bu, Jaemin Yoo, Kijung Shin
2023-06-04
Abstract: Graph neural networks (GNNs) learn the representation of graph-structured data, and their expressiveness can be further enhanced by inferring node relations for propagation. Attention-based GNNs infer neighbor importance to manipulate the weight of its propagation. Despite their popularity, the discussion on deep graph attention and its unique challenges has been limited. In this work, we investigate some problematic phenomena related to deep graph attention, including vulnerability to over-smoothed features and smooth cumulative attention. Through theoretical and empirical analyses, we show that various attention-based GNNs suffer from these problems. Motivated by our findings, we propose AERO-GNN, a novel GNN architecture designed for deep graph attention. AERO-GNN provably mitigates the proposed problems of deep graph attention, which is further empirically demonstrated with (a) its adaptive and less smooth attention functions and (b) higher performance at deep layers (up to 64). On 9 out of 12 node classification benchmarks, AERO-GNN outperforms the baseline GNNs, highlighting the advantages of deep graph attention. Our code is available at https://github.com/syleeheal/AERO-GNN.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the loss of expressive power that existing attention-based graph neural networks (GNNs) suffer as they get deeper. It focuses on two problems:

1. **Vulnerability to over-smoothed features**: When node features become over-smoothed at deep layers, existing attention mechanisms can no longer distinguish the relations between different nodes.
2. **Smooth cumulative attention**: As the number of layers grows, the cumulative attention matrix becomes overly smooth, so the differences in attention between nodes gradually vanish.

These problems limit the performance of existing attention-based GNNs at deep layers, especially on complex graph-structured data. To address them, the paper proposes AERO-GNN, a new GNN architecture designed for deep graph attention, which aims to improve the expressive power and performance of deep GNNs by making the attention mechanism itself robust to depth.
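
The smooth cumulative attention problem can be illustrated with a small numerical sketch. It is not taken from the paper or its code: it assumes, as a simplification, that cumulative attention is the product of per-layer row-stochastic attention matrices, and it uses random attention weights on a toy 6-node ring graph. The helper names (`random_attention`, `row_spread`, `cumulative_spread`) are hypothetical.

```python
import numpy as np

def random_attention(adj, rng):
    """Random row-stochastic attention restricted to the edges of adj.

    Assumption: stands in for one layer of learned edge attention.
    """
    scores = (rng.random(adj.shape) + 0.1) * adj  # strictly positive on edges
    return scores / scores.sum(axis=1, keepdims=True)

def row_spread(mat):
    """Largest per-column gap between rows; 0 means all rows are identical."""
    return float((mat.max(axis=0) - mat.min(axis=0)).max())

def cumulative_spread(adj, depths, seed=0):
    """Track how different the rows of the cumulative attention remain."""
    rng = np.random.default_rng(seed)
    cumulative = np.eye(adj.shape[0])
    spreads = {}
    for layer in range(1, max(depths) + 1):
        # Assumed definition: cumulative attention = product of layer attentions.
        cumulative = random_attention(adj, rng) @ cumulative
        if layer in depths:
            spreads[layer] = row_spread(cumulative)
    return spreads

# Toy graph: 6-node ring with self-loops.
n = 6
adj = np.eye(n)
for i in range(n):
    adj[i, (i + 1) % n] = 1.0
    adj[i, (i - 1) % n] = 1.0

spreads = cumulative_spread(adj, depths=(1, 4, 16, 64))
for depth, spread in spreads.items():
    print(f"depth {depth:2d}: row spread {spread:.2e}")
```

Under these assumptions, the row spread shrinks toward zero as depth grows: every node ends up with nearly the same cumulative attention over all other nodes, which is the loss of node-specific attention that the second problem describes.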