Infinite Width Graph Neural Networks for Node Regression/ Classification

Yunus Cobanoglu
DOI: https://doi.org/10.48550/arXiv.2310.08176
2023-11-21
Abstract: This work analyzes Graph Neural Networks, a generalization of Fully-Connected Deep Neural Nets to graph-structured data, when their width, that is, the number of nodes in each fully connected layer, increases to infinity. Infinite Width Neural Networks connect Deep Learning to Gaussian Processes and Kernels, both Machine Learning frameworks with long traditions and extensive theoretical foundations. Gaussian Processes and Kernels have far fewer hyperparameters than Neural Networks and can be used for uncertainty estimation, making them more user-friendly for applications. This work extends the growing body of research connecting Gaussian Processes and Kernels to Neural Networks. The Kernel and Gaussian Process closed forms are derived for a variety of architectures, namely the standard Graph Neural Network, the Graph Neural Network with Skip-Concatenate Connections and the Graph Attention Neural Network. All architectures are evaluated on a variety of datasets on the task of transductive Node Regression and Classification. Additionally, a Spectral Sparsification method known as Effective Resistance is used to improve runtime and memory requirements. Extending the setting to inductive graph learning tasks (Graph Regression/Classification) is straightforward and is briefly discussed in 3.5.
Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is the behavior of Graph Neural Networks (GNNs) as their width (i.e., the number of nodes in each fully connected layer) tends to infinity. Specifically, the paper derives closed-form expressions for the Gaussian Process (GP) and Neural Tangent Kernel (NTK) of infinitely wide GNNs on node regression and classification tasks, and evaluates the resulting kernel models on various datasets.

### Main Contributions:
1. **Closed-form Expressions**: Derived closed-form expressions for the GNN Gaussian Process (GNNGP) and Graph Neural Tangent Kernel (GNTK) on node regression and classification tasks for three architectures: the standard GNN, the GNN with skip-concatenate connections, and the Graph Attention Network (GAT).
2. **Experimental Evaluation**: Evaluated the performance of GNNGP and GNTK against their corresponding neural network models on various datasets, and applied a spectral sparsification method based on effective resistance to improve runtime and memory requirements.

### Background and Motivation:
- **Graph Neural Networks (GNNs)**: GNNs extend fully connected deep neural networks to graph-structured data and are widely used in graph data processing tasks.
- **Infinitely Wide Neural Networks**: Research on infinitely wide neural networks has been very active in recent years, especially Neural Tangent Kernel (NTK) theory, which connects infinitely wide fully connected deep neural networks with kernel methods and provides a new perspective on the theoretical properties of deep learning.

### Methods and Results:
- **Theoretical Derivation**: The paper derives closed-form expressions for the GNNGP and GNTK of infinitely wide GNNs. These expressions apply to standard GNNs and also extend to GNNs with skip-concatenate connections and to Graph Attention Networks (GAT).
- **Experimental Validation**: Experiments on multiple datasets compare GNNGP and GNTK with their neural network counterparts. According to the summary, the closed-form kernels approximate the corresponding neural network models well and, in some cases, perform better.

### Significance and Impact:
- **Theoretical Foundation**: The paper provides a theoretical foundation for understanding the behavior of GNNs in the infinite-width limit, which supports further research on the generalization ability and optimization characteristics of GNNs.
- **Practical Application**: Spectral sparsification reduces the computational resources the kernel methods require, making their application to larger graphs more practical.

In summary, through theoretical derivation and experimental validation, this paper explores the properties of infinitely wide GNNs and provides a reference point for both the theoretical study and the practical application of GNNs.
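
To make the idea of a closed-form GNNGP kernel more concrete, the following is a minimal NumPy sketch, not the paper's exact derivation. It assumes a standard GNN with ReLU activations and a symmetrically normalized adjacency matrix with self-loops, and it follows the general infinite-width GP recipe: the covariance of one layer is the expected product of activations under the previous layer's covariance, aggregated over the graph. The function names (`normalized_adjacency`, `relu_expectation`, `gnngp_kernel`, `gp_posterior_mean`), the hyperparameters `sigma_w`, `sigma_b` and `noise`, and the toy graph are all illustrative assumptions.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (an assumed choice)."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def relu_expectation(K):
    """E[relu(u) relu(v)] for zero-mean Gaussians with covariance K.

    This is the degree-1 arc-cosine kernel, the known closed form for ReLU.
    """
    std = np.sqrt(np.clip(np.diag(K), 1e-12, None))
    norm = np.outer(std, std)
    cos = np.clip(K / norm, -1.0, 1.0)
    theta = np.arccos(cos)
    return norm * (np.sin(theta) + (np.pi - theta) * cos) / (2.0 * np.pi)


def gnngp_kernel(A, X, depth=2, sigma_w=np.sqrt(2.0), sigma_b=0.0):
    """Node-by-node GP covariance of an assumed infinitely wide ReLU GNN."""
    d = X.shape[1]
    # First layer: linear map of the aggregated input features.
    K = sigma_w**2 * A @ (X @ X.T / d) @ A.T + sigma_b**2
    for _ in range(depth - 1):
        # Later layers: ReLU expectation, then graph aggregation.
        K = sigma_w**2 * A @ relu_expectation(K) @ A.T + sigma_b**2
    return K


def gp_posterior_mean(K, train_idx, test_idx, y_train, noise=1e-2):
    """Transductive GP regression: predict test nodes from labeled nodes."""
    K_tt = K[np.ix_(train_idx, train_idx)] + noise * np.eye(len(train_idx))
    K_st = K[np.ix_(test_idx, train_idx)]
    return K_st @ np.linalg.solve(K_tt, y_train)


# Toy example: a path graph on 4 nodes with random 3-dimensional features.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
A = normalized_adjacency(adj)
K = gnngp_kernel(A, X, depth=2)
pred = gp_posterior_mean(K, [0, 3], [1, 2], np.array([1.0, -1.0]))
```

Because the task is transductive, the kernel is computed once over all nodes and then sliced into labeled and unlabeled blocks. The full n × n kernel is also what makes runtime and memory a concern on large graphs, which is the motivation the summary gives for sparsifying the graph first.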