Graph Neural Networks with Precomputed Node Features

Beni Egressy, Roger Wattenhofer
DOI: https://doi.org/10.48550/arXiv.2206.00637
2022-09-17
Abstract: Most Graph Neural Networks (GNNs) cannot distinguish some graphs or indeed some pairs of nodes within a graph. This makes it impossible to solve certain classification tasks. However, adding additional node features to these models can resolve this problem. We introduce several such augmentations, including (i) positional node embeddings, (ii) canonical node IDs, and (iii) random features. These extensions are motivated by theoretical results and corroborated by extensive testing on synthetic subgraph detection tasks. We find that positional embeddings significantly outperform other extensions in these tasks. Moreover, positional embeddings have better sample efficiency, perform well on different graph distributions and even outperform learning with ground truth node positions. Finally, we show that the different augmentations perform competitively on established GNN benchmarks, and advise on when to use them.
Machine Learning
### What problem does this paper attempt to address?

This paper addresses the limited ability of Graph Neural Networks (GNNs) to distinguish certain graphs, or certain pairs of nodes within a graph. Most existing GNN architectures are based on the message-passing framework, and such models cannot tell apart some very simple graph structures (regular graphs, for example). As a result, they cannot solve certain classification tasks.

To overcome this, the authors augment GNNs with additional node features that increase their expressive power. The augmentations include:

1. **Positional Node Embeddings**: Precomputed embeddings of each node's position give the GNN information about the topological structure of the graph.
2. **Canonical Node IDs**: Node IDs are assigned in a canonical way, so that they remain consistent under different permutations of the same graph.
3. **Random Features**: Random node features make otherwise identical nodes distinguishable, which improves the model's discriminative ability.

These augmentations are motivated by theoretical results and supported empirically on synthetic subgraph detection tasks and established GNN benchmarks. Positional node embeddings in particular show a clear advantage in the experiments, with better sample efficiency and strong performance across different graph distributions.

### Formula and Symbol Explanation

- The graph \(G=(V, E)\) is unweighted, undirected and connected, where \(V\) is the set of nodes and \(E\) is the set of edges.
- The neighborhood of node \(v\) is \(N(v)=\{u\in V\mid e_{uv}\in E\}\).
- The degree of node \(v\) is \(\text{deg}(v) = |N(v)|\).
- The diameter \(D\) of the graph is the length of the longest shortest path between any two nodes.
- Message-passing formula, where \(x_v\) is the input feature of node \(v\), \(h_v^{(t)}\) is its state after round \(t\), and \(y_v\) is its output:

  \[ h_v^{(0)}=x_v \]
  \[ a_v^{(t)}=\text{AGGREGATE}(\{h_u^{(t-1)}\mid u\in N(v)\}) \]
  \[ h_v^{(t)}=\text{UPDATE}(h_v^{(t-1)}, a_v^{(t)}) \]
  \[ y_v=\text{READOUT}(h_v^{(t)}) \]

### Conclusion

With these augmentations, and positional node embeddings in particular, GNNs gain discriminative power and sample efficiency on the subgraph detection tasks studied. This offers a practical way around the limitations of standard message-passing GNNs on such tasks.
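As an illustration of the notation and the message-passing scheme summarized above, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes an idealized AGGREGATE (the sorted multiset of neighbor states) and UPDATE (pairing the previous state with the aggregate). These stand in for learned functions and make the procedure equivalent to colour refinement. The two example graphs (the complete bipartite graph K3,3 and the triangular prism), the helper names, and the use of plain unique integers as node IDs are all assumptions chosen for illustration. In particular, the unique integers only show the effect of distinguishing features and do not implement the canonical, permutation-consistent IDs described in the paper.

```python
from collections import deque


def build_adjacency(n, edges):
    """Return N(v) for every node v of an undirected graph on nodes 0..n-1."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def diameter(adj):
    """Length of the longest shortest path, via BFS from every node."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        best = max(best, max(dist.values()))
    return best


def message_passing(adj, x, rounds):
    """Run `rounds` rounds of message passing starting from h^(0) = x."""
    h = dict(x)
    for _ in range(rounds):
        # AGGREGATE (assumed): the multiset of neighbor states, as a sorted tuple.
        a = {v: tuple(sorted(h[u] for u in adj[v])) for v in adj}
        # UPDATE (assumed): pair the previous state with the aggregate.
        h = {v: (h[v], a[v]) for v in adj}
    return h


# Two connected 3-regular graphs on six nodes that are not isomorphic:
# the prism contains triangles, K3,3 does not.
K33_EDGES = [(u, v) for u in (0, 1, 2) for v in (3, 4, 5)]
PRISM_EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5),
               (0, 3), (1, 4), (2, 5)]

k33 = build_adjacency(6, K33_EDGES)
prism = build_adjacency(6, PRISM_EDGES)

constant = {v: 0 for v in range(6)}   # identical input features x_v
node_ids = {v: v for v in range(6)}   # hypothetical unique-ID features

# With constant features, compare the multisets of final node states.
h_k33 = message_passing(k33, constant, rounds=3)
h_prism = message_passing(prism, constant, rounds=3)
same_multiset = sorted(h_k33.values()) == sorted(h_prism.values())

# With unique IDs as features, count the distinct node states in the prism.
h_prism_ids = message_passing(prism, node_ids, rounds=3)
distinct_states = len(set(h_prism_ids.values()))

print("degrees (prism):", sorted(len(prism[v]) for v in prism))
print("diameters:", diameter(k33), diameter(prism))
print("indistinguishable with constant features:", same_multiset)
print("distinct node states with ID features:", distinct_states)
```

Under these assumptions, both graphs produce the same multiset of node states when every node starts with the same feature, even though only the prism contains a triangle. Once each node carries a distinguishing feature, the node states separate. This is the effect the augmentations are meant to provide.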