Over-smoothing Effect of Graph Convolutional Networks

Fang Sun
DOI: https://doi.org/10.48550/arXiv.2201.12830
2022-02-01
Abstract: Over-smoothing is a severe problem which limits the depth of Graph Convolutional Networks. This article gives a comprehensive analysis of the mechanism behind Graph Convolutional Networks and the over-smoothing effect. The article proposes an upper bound for the occurrence of over-smoothing, which offers insight into the key factors behind over-smoothing. The results presented in this article successfully explain the feasibility of several algorithms that alleviate over-smoothing.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the over-smoothing problem in Graph Convolutional Networks (GCNs). The author first observes experimentally that GCN performance drops rapidly as the number of layers increases; this phenomenon is called over-smoothing. Over-smoothing means that after repeated rounds of Laplacian smoothing, node features become indistinguishable, which degrades the model's classification performance. To understand the problem in depth, the paper sets three main objectives:

1. **Give a mathematical definition of "smoothness"**: Describe the degree of smoothness of node features through a mathematical formula.
2. **Analyze the mechanism of over-smoothing and the scenarios in which it occurs**: Explore the specific causes of over-smoothing, in particular the cases in which it is likely to occur.
3. **Explain different methods to alleviate over-smoothing**: Introduce and analyze several existing techniques that alleviate over-smoothing, such as Graph Sparsification and Residual Connections.

### Mathematical Definition and Analysis

#### 1. Mathematical Definition of Smoothness

The author proposes a smoothness measure based on topological information. Specifically, \(\epsilon\)-smoothness is defined as follows: a GCN is said to have \(\epsilon\)-smoothness if there exists a layer \(L\) such that, for every hidden layer \(l \geq L\), the output feature \(H^{(l)}\) lies at a distance less than \(\epsilon\) from the subspace \(M\). That is:

\[ \exists L, \forall l \geq L, \; d_M(H^{(l)}) < \epsilon \]

where \(d_M(H^{(l)})\) represents the distance from \(H^{(l)}\) to the subspace \(M\), and \(M\) is the space spanned by some eigenvectors of the graph.

#### 2. Mechanism of Over-smoothing

The author proves through spectral analysis that over-smoothing is inevitable.
Specifically, for a connected graph \(G\), the normalized smoothing matrix \(S = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}\), where \(\tilde{A} = A + I\) is the adjacency matrix with self-loops and \(\tilde{D}\) is its degree matrix, satisfies:

\[ \lim_{k \to \infty} S^k = \Pi \]

where \(\Pi = \Phi(\tilde{D}^{1/2} e) \, \Phi(\tilde{D}^{1/2} e)^\top\), \(\Phi(x) = \frac{x}{\|x\|}\), and \(e\) is the all-ones vector. This shows that repeated propagation projects node features onto a single direction determined only by node degrees, so nodes become indistinguishable as the number of layers increases, resulting in over-smoothing.

#### 3. Methods to Alleviate Over-smoothing

The paper discusses several effective methods to alleviate over-smoothing:

- **Graph Sparsification**: Slows information propagation by randomly removing edges from the graph, as in the DropEdge method.
- **Residual Connections**: Preserves the information of the input features by adding residual connections, as in the GCNII method.
- **Utilization of Different Convolution Depths**: Exploits the advantages of deep convolution by combining convolution features from different depths, as in the DAGNN method.

### Main Conclusions

Through theoretical analysis and experimental verification, the paper reveals the essence of over-smoothing and explains why several alleviation methods are effective. These results not only help improve the performance of GCNs, but also provide a theoretical basis for future research.
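
The limit \(S^k \to \Pi\) from the section on the mechanism of over-smoothing can be checked numerically on a toy graph. The sketch below is an illustration only, not the paper's code. It assumes a 4-node path graph, linear propagation without weights or nonlinearities, and takes \(M\) to be the one-dimensional span of \(\Phi(\tilde{D}^{1/2} e)\). The helper names (`normalized_adjacency`, `limit_projection`, `gap_after`, `dist_to_M`, `smooth`) and the feature matrix `X` are hypothetical.

```python
import math


def matmul(a, b):
    """Dense matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adj):
    """Return (S, deg) with S = D~^{-1/2} (A + I) D~^{-1/2}."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # degrees including self-loops
    s = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    return s, deg


def limit_projection(deg):
    """Pi = phi phi^T with phi = D~^{1/2} e / ||D~^{1/2} e||."""
    norm = math.sqrt(sum(deg))  # ||D~^{1/2} e||^2 is the sum of degrees
    phi = [math.sqrt(d) / norm for d in deg]
    return [[p * q for q in phi] for p in phi]


# Assumed toy graph: path 0 - 1 - 2 - 3 (connected).
ADJ = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
S, DEG = normalized_adjacency(ADJ)
PI = limit_projection(DEG)


def gap_after(k):
    """Largest absolute entry of S^k - Pi, for k >= 1."""
    p = S
    for _ in range(k - 1):
        p = matmul(p, S)
    return max(abs(x - y) for rp, rq in zip(p, PI) for x, y in zip(rp, rq))


def dist_to_M(h):
    """Frobenius norm of H - Pi H: distance to the assumed subspace M."""
    proj = matmul(PI, h)
    return math.sqrt(sum((x - y) ** 2
                         for rh, rp in zip(h, proj) for x, y in zip(rh, rp)))


def smooth(h, k):
    """Apply k rounds of linear propagation H <- S H (no weights, no ReLU)."""
    for _ in range(k):
        h = matmul(S, h)
    return h


# Arbitrary illustrative node features (4 nodes, 2 channels).
X = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [0.5, 3.0]]

if __name__ == "__main__":
    for k in (1, 5, 20, 60):
        print(k, gap_after(k), dist_to_M(smooth(X, k)))
```

Both printed quantities shrink as `k` grows, which matches the \(\epsilon\)-smoothness picture: for any chosen \(\epsilon\), a sufficiently deep stack of propagation steps brings the features within \(\epsilon\) of \(M\). A real GCN interleaves weight matrices and nonlinearities, which this sketch deliberately leaves out.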