Polynormer: Polynomial-Expressive Graph Transformer in Linear Time

Chenhui Deng, Zichao Yue, Zhiru Zhang
2024-04-07
Abstract: Graph transformers (GTs) have emerged as a promising architecture that is theoretically more expressive than message-passing graph neural networks (GNNs). However, typical GT models have at least quadratic complexity and thus cannot scale to large graphs. While several linear GTs have recently been proposed, they still lag behind GNN counterparts on several popular graph datasets, which raises a critical concern about their practical expressivity. To balance the trade-off between expressivity and scalability of GTs, we propose Polynormer, a polynomial-expressive GT model with linear complexity. Polynormer is built upon a novel base model that learns a high-degree polynomial on input features. To make the base model permutation equivariant, we integrate it with graph topology and node features separately, resulting in local and global equivariant attention models. Consequently, Polynormer adopts a linear local-to-global attention scheme to learn high-degree equivariant polynomials whose coefficients are controlled by attention scores. Polynormer has been evaluated on 13 homophilic and heterophilic datasets, including large graphs with millions of nodes. Our extensive experimental results show that Polynormer outperforms state-of-the-art GNN and GT baselines on most datasets, even without the use of nonlinear activation functions.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the trade-off between expressivity and scalability in graph transformers (GTs). Typical GTs have at least quadratic complexity in the number of nodes and therefore cannot scale to large graphs. Linear GTs have been proposed, but they still lag behind GNNs on several popular graph datasets, which calls their practical expressivity into question. The paper introduces Polynormer, a GT with polynomial expressivity and linear complexity. It is built on a base model that learns a high-degree polynomial of the input features. To make that base model permutation equivariant, it is integrated with graph topology and with node features separately, which yields a local and a global equivariant attention model. The resulting linear local-to-global attention scheme learns high-degree equivariant polynomials whose coefficients are controlled by attention scores. Polynormer is evaluated on 13 homophilic and heterophilic datasets, including large graphs with millions of nodes, and outperforms state-of-the-art GNN and GT baselines on most of them, even without nonlinear activation functions.
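To illustrate why attention can run in linear rather than quadratic time in the number of nodes, the following is a minimal sketch of generic kernelized attention. It is not the paper's implementation: the sigmoid feature map and the function names `quadratic_attention` and `linear_attention` are assumptions chosen for illustration. The sketch relies only on associativity of matrix multiplication: computing φ(Q)(φ(K)ᵀV) avoids forming the n × n score matrix that (φ(Q)φ(K)ᵀ)V requires.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def feature_map(m):
    """Elementwise sigmoid (an assumption): keeps entries positive so the
    row normalizers below are never zero."""
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in m]


def quadratic_attention(q, k, v):
    """Reference version: materializes the n x n score matrix."""
    scores = matmul(feature_map(q), transpose(feature_map(k)))  # n x n
    out = matmul(scores, v)                                      # n x d
    return [[x / sum(s) for x in row] for row, s in zip(out, scores)]


def linear_attention(q, k, v):
    """Same result, but only d x d intermediates are built, so the cost
    grows linearly with the number of nodes n."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)             # d x d summary of keys and values
    k_sum = [sum(col) for col in zip(*fk)]    # length-d sum of key features
    out = matmul(fq, kv)                      # n x d
    norms = [sum(a * b for a, b in zip(row, k_sum)) for row in fq]
    return [[x / z for x in row] for row, z in zip(out, norms)]


# Small random example: n nodes with d-dimensional features.
random.seed(0)
n, d = 6, 3
q = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
k = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
v = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]

out_quadratic = quadratic_attention(q, k, v)
out_linear = linear_attention(q, k, v)
```

Both functions return the same n × d output up to floating-point error. The linear version never stores pairwise node scores, which is what allows this style of global attention to scale to graphs with millions of nodes.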