Abstract: In this work, we generalize the ideas of Kaiming initialization to Graph Neural Networks (GNNs) and propose a new scheme (G-Init) that reduces oversmoothing, leading to very good results in node and graph classification tasks. GNNs are commonly initialized using methods designed for other types of Neural Networks, overlooking the underlying graph topology. We analyze theoretically the variance of signals flowing forward and gradients flowing backward in the class of convolutional GNNs. We then simplify our analysis to the case of the GCN and propose a new initialization method. Our results indicate that the new method (G-Init) reduces oversmoothing in deep GNNs, facilitating their effective use. Experimental validation supports our theoretical findings, demonstrating the advantages of deep networks in scenarios with no feature information for unlabeled nodes (i.e., the "cold start" scenario).

What problem does this paper attempt to address?

This paper addresses the over-smoothing problem in graph neural networks (GNNs): as the number of layers increases, node representations become too similar, losing the initial information and degrading model performance. To address this, the authors propose a new weight initialization method (G-Init), which aims to stabilize the variance of signals and gradients as they flow through the model and thereby reduce over-smoothing.

### Main problems

1. **Over-smoothing problem**: As the number of GNN layers increases, the node representations gradually converge, so the model can no longer distinguish different nodes effectively, which hurts performance on classification tasks.
2. **Limitations of existing initialization methods**: Traditional weight initialization methods (such as Kaiming initialization) are designed for other types of neural networks and do not account for the graph structure, so they are not effective in GNNs.

### Solutions

The authors address these problems in three steps:

1. **Theoretical analysis**: The authors generalize the initialization method proposed by He et al. to convolutional GNNs and analyze how the variance of the forward-propagation signals and the backward-propagation gradients changes across layers.
2. **New initialization method (G-Init)**: Based on the theoretical analysis, the authors propose a new initialization method, specialized to the GCN model. The method controls the variance by adjusting the standard deviation of the weight matrix to prevent over-smoothing.
3. **Experimental verification**: Experiments on multiple datasets verify the effectiveness of G-Init, especially in deeper GNNs.
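The over-smoothing effect described above can be illustrated with a minimal sketch. It is not the paper's model: it assumes a 6-node ring graph, a single scalar feature per node, and plain mean aggregation with self-loops (no weights, no nonlinearity). The helper names `mean_aggregate`, `spread`, and `spread_by_depth` are hypothetical and chosen only for this example.

```python
def mean_aggregate(adj, x):
    """One propagation step: each node averages itself and its neighbors."""
    out = []
    for i, nbrs in enumerate(adj):
        group = [i] + nbrs  # self-loop plus neighbors
        out.append(sum(x[j] for j in group) / len(group))
    return out


def spread(x):
    """Gap between the largest and smallest node value (0 means all nodes are identical)."""
    return max(x) - min(x)


def spread_by_depth(adj, x, depth):
    """Record the spread of node values after 0, 1, ..., depth propagation steps."""
    spreads = [spread(x)]
    for _ in range(depth):
        x = mean_aggregate(adj, x)
        spreads.append(spread(x))
    return spreads


# Assumed toy graph: a ring of 6 nodes given as an adjacency list.
n = 6
ring = [[(i - 1) % n, (i + 1) % n] for i in range(n)]

# Assumed toy feature: only node 0 carries a signal.
features = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

spreads = spread_by_depth(ring, features, depth=30)
for layer in (0, 1, 5, 10, 30):
    print(f"layer {layer:2d}: spread = {spreads[layer]:.6f}")
```

Because every step replaces a node's value with an average of existing values, the spread can never grow, and on this connected graph it shrinks toward zero as depth increases. Real GNN layers also apply weight matrices and nonlinearities, which is where the choice of initialization enters.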
### Key formulas

- **Upper bound of forward-propagation variance**:

  \[ \mathrm{Var}[y_i^{(l)}] \leq n_l \cdot \left(d_i + \mathbb{1}(\beta \neq 0) + \mathbb{1}(\gamma \neq 0)\right) \times \left(\frac{\alpha^2}{2 d_i^2}\,\mathrm{Var}[y_i^{(l-1)}] + \frac{\gamma^2}{2}\,\mathrm{Var}[y_i^{(l-2)}] + j(\alpha, \beta)\right) \times \left(\delta^2\,\mathrm{Var}[w_l] + \epsilon^2\right) \]

  where $n_l$ is the dimension of the weight matrix, $d_i$ is the degree of node $i$, $\mathbb{1}(\cdot)$ is the indicator function, $\alpha, \beta, \gamma, \delta, \epsilon$ are model parameters, and $j(\alpha, \beta)$ is the function defined in Lemma 3.

- **Upper bound of backward-propagation variance**:

  \[ \mathrm{Var}[\Delta x_i^{(l)}] \leq m_w \cdot \left(\frac{\alpha^2}{d_i^2}\,\mathrm{Var}[\Delta x_i^{(l+1)}] + q(\alpha)\right) \]

  where

  \[ m_w = \frac{1}{2 n_l \left(d_i + \mathbb{1}(\gamma \neq 0)\right)} \cdot \left(\delta^2\,\mathrm{Var}[w_l] + \epsilon^2\right) \]

- **Standard deviation of G-Init initialization**:

  \[ \sigma = \sqrt{\frac{2 d_i}{n_l}} \]

Based on these bounds, G-Init is reported to reduce over-smoothing and to improve the performance of GNNs in node classification and graph classification tasks.
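As a rough illustration of the last formula, the sketch below compares the G-Init standard deviation with the usual Kaiming value $\sqrt{2/n_l}$ and draws a Gaussian weight matrix with it. Two things here are assumptions rather than statements from the paper: $n_l$ is read as the layer's fan-in, and the per-node degree $d_i$ is replaced by a single scalar `degree` (for instance an average degree), since one weight matrix is shared by all nodes. The function names are hypothetical.

```python
import math
import random
import statistics


def kaiming_std(fan_in):
    """Standard deviation used by Kaiming (He) initialization for ReLU layers."""
    return math.sqrt(2.0 / fan_in)


def g_init_std(fan_in, degree):
    """sigma = sqrt(2 * d / n_l), following the formula quoted above.

    Assumption: `degree` is one scalar summary of node degrees, and
    `fan_in` plays the role of n_l.
    """
    return math.sqrt(2.0 * degree / fan_in)


def sample_weights(fan_in, fan_out, std, rng):
    """Draw a fan_in x fan_out matrix of zero-mean Gaussian weights."""
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]


rng = random.Random(0)
fan_in, fan_out, degree = 64, 32, 4  # illustrative values only

sigma_k = kaiming_std(fan_in)
sigma_g = g_init_std(fan_in, degree)
weights = sample_weights(fan_in, fan_out, sigma_g, rng)
empirical = statistics.pstdev(w for row in weights for w in row)

print(f"Kaiming std: {sigma_k:.4f}")
print(f"G-Init std:  {sigma_g:.4f}")
print(f"Empirical std of sampled weights: {empirical:.4f}")
```

Under these assumptions the G-Init standard deviation is the Kaiming one scaled by $\sqrt{d}$, so the two coincide when the degree is 1 and G-Init draws larger weights on denser graphs.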

Reducing Oversmoothing through Informed Weight Initialization in Graph Neural Networks
