Abstract: In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs of repeated feature propagation and non-linear transformation during training. One commonly employed approach to address this challenge is model-simplification, which executes the Propagation (P) only once during pre-processing, Combines (C) the resulting receptive fields in different ways, and then feeds them into a simple model for better performance. Despite their high predictive performance and scalability, these methods still face two limitations. First, existing approaches mainly focus on exploring different C methods from the model perspective, neglecting, from the data-centric perspective, the crucial problem of performance degradation with increasing P depth, known as the over-smoothing problem. Second, pre-processing overhead takes up most of the end-to-end processing time, especially for large-scale graphs. To address these limitations, we present random walk with noise masking (RMask), a plug-and-play module compatible with existing model-simplification works. This module enables the exploration of deeper GNNs while preserving their scalability. Unlike previous model-simplification works, we focus on continuous P, find that the noise inside each P is the cause of the over-smoothing issue, and use an efficient masking mechanism to eliminate it. Experimental results on six real-world datasets demonstrate that model-simplification works equipped with RMask yield superior performance compared to their original versions and achieve a good trade-off between accuracy and efficiency.

What problem does this paper attempt to address?

This paper attempts to solve the scalability problem of graph neural networks (GNNs) on large-scale graph data. Specifically:

1. **Over-smoothing Problem**:
   - In existing model-simplified GNNs, node representations become indistinguishable as the propagation depth increases, leading to a performance decline. This phenomenon is called the "over-smoothing problem". Existing methods mainly focus on designing different combination methods (C) while ignoring the noise information introduced during the propagation process.
   - The authors find that the noise information introduced in each propagation operation (P) is a key factor causing the over-smoothing problem.
2. **High Pre-processing Overhead**:
   - Existing model-simplified GNNs improve efficiency by moving the expensive feature propagation step into the pre-processing stage, but as a result pre-processing takes up most of the end-to-end processing time, especially on large-scale graphs.
   - This pre-processing depends on the information cross-correlation between different propagation steps and can only be carried out sequentially, further increasing the computational cost.

To solve these problems, the authors propose the Random Walk with Noise Masking module (RMask), which has the following features:

- **Noise Masking Mechanism**: By identifying and masking redundant information in each propagation step, pure high-order information is extracted, thereby alleviating the over-smoothing problem.
- **Efficient Random Walk**: Random walks capture the truly useful information in a parallel manner, reducing pre-processing overhead.
- **Sparse Graph Generation**: The noise masking mechanism generates sparse graphs, further reducing the overhead of aggregation calculations.
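The masking idea described above can be illustrated with a small sketch. It is not the authors' implementation (which, per the summary, relies on random walks); it uses a plain breadth-first search and a hypothetical toy graph, and the function names `hop_distances`, `exact_hop_mask`, and `walk_reachable` are made up for illustration. It contrasts the nodes an unmasked h-step propagation touches with the nodes that sit at a shortest-path distance of exactly h, which is the "pure" h-order information a mask would keep.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: shortest hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def exact_hop_mask(adj, source, h):
    """Nodes whose shortest-path distance from source is exactly h (kept by the mask)."""
    dist = hop_distances(adj, source)
    return {v for v, d in dist.items() if d == h}


def walk_reachable(adj, source, h):
    """Nodes reachable by some walk of length h (what unmasked propagation mixes in)."""
    frontier = {source}
    for _ in range(h):
        frontier = {v for u in frontier for v in adj[u]}
    return frontier


# Hypothetical toy data: the path graph 0 - 1 - 2 - 3 as an adjacency list.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

unmasked = walk_reachable(adj, 0, 2)   # contains node 0 again (distance 0 < 2)
masked = exact_hop_mask(adj, 0, 2)     # only node 2, the true 2-hop neighbor
print(unmasked, masked)
```

On this toy graph, two propagation steps from node 0 return to node 0 itself as well as reaching node 2; the mask drops the revisited node and keeps only the node at distance 2.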
The experimental results show that the model-simplified GNN equipped with RMask performs well on multiple datasets, can achieve a good balance between accuracy and efficiency, and can use deep-level information more effectively to improve prediction performance.

### Formula Summary

- **Node Smoothness Level (NSL)**:
  \[ \text{NSL}_i=\frac{1}{N - 1}\sum_{j\in V,j\neq i}\frac{X_i\cdot X_j}{|X_i||X_j|} \]
- **Graph Smoothness Level (GSL)**:
  \[ \text{GSL}=\frac{1}{N}\sum_{i\in V}\text{NSL}_i \]
- **De-noise Matrix**:
  \[ M^h_i=\left\{m_{ij}\mid m_{ij}=\begin{cases}1&\text{if distance}(v_i,v_j) = h\\0&\text{if distance}(v_i,v_j)<h\end{cases}\right\} \]
- **Personalized PageRank**:
  \[ S=\alpha(I-(1 - \alpha)\hat{A})^{-1} \]

These formulas and mechanisms work together to enable RMask to effectively solve the over-smoothing problem and high pre-processing overhead problem in the existing model-simplified GNNs.
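As a rough illustration of the smoothness and PageRank formulas, the following NumPy sketch evaluates them directly on dense matrices. It makes several assumptions not stated in the summary: `X` is an N×d feature matrix with no all-zero rows, `A_hat` is some normalized adjacency matrix (the summary does not define \(\hat{A}\)), and the closed-form inverse is only practical for small graphs. The function names are hypothetical.

```python
import numpy as np


def node_smoothness(X):
    """NSL_i: mean cosine similarity between node i and every other node."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / norms                      # assumes no all-zero feature rows
    cos = Xn @ Xn.T                     # pairwise cosine similarities
    n = X.shape[0]
    # Subtract the self-similarity on the diagonal, then average over N - 1 others.
    return (cos.sum(axis=1) - np.diag(cos)) / (n - 1)


def graph_smoothness(X):
    """GSL: average of NSL_i over all nodes."""
    return node_smoothness(X).mean()


def personalized_pagerank(A_hat, alpha):
    """S = alpha * (I - (1 - alpha) * A_hat)^(-1), computed densely."""
    n = A_hat.shape[0]
    return alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * A_hat)


# Hypothetical toy inputs.
X_distinct = np.eye(3)          # mutually orthogonal features: GSL is 0
X_smoothed = np.ones((3, 2))    # identical features: GSL is 1 (up to rounding)
A_hat = np.array([[0.5, 0.5],
                  [0.5, 0.5]])  # row-stochastic example matrix
S = personalized_pagerank(A_hat, alpha=0.1)
print(graph_smoothness(X_distinct), graph_smoothness(X_smoothed), S.sum(axis=1))
```

A GSL close to 1 indicates that node representations have become nearly indistinguishable, which is the symptom of over-smoothing described above. When `A_hat` is row-stochastic, each row of `S` sums to 1.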

Towards Scalable and Deep Graph Neural Networks via Noise Masking

Towards Robust Graph Neural Networks against Label Noise

GNN Cleaner: Label Cleaner for Graph Structured Data

Scaling Up Graph Neural Networks Via Graph Coarsening

Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptive Residual Module

Deep Graph Neural Networks via Flexible Subgraph Aggregation

Graph Representation Learning on Noise and Sparse Labels.

Model Degradation Hinders Deep Graph Neural Networks

DEGNN: Dual Experts Graph Neural Network Handling Both Edge and Node Feature Noise

Decoupling the Depth and Scope of Graph Neural Networks

Limiting Over-Smoothing and Over-Squashing of Graph Message Passing by Deep Scattering Transforms

Graph Neural Networks Inspired by Classical Iterative Algorithms

Towards Robust Graph Neural Networks for Noisy Graphs with Sparse Labels

The Snowflake Hypothesis: Training and Powering GNN with One Node One Receptive Field

Efficient Model-Based OPC via Graph Neural Network

Multicoated and Folded Graph Neural Networks with Strong Lottery Tickets

Towards Efficient Point Cloud Graph Neural Networks Through Architectural Simplification

Blocking-based Neighbor Sampling for Large-scale Graph Neural Networks.

Evaluating Deep Graph Neural Networks

How Powerful is Implicit Denoising in Graph Neural Networks

A Survey of Graph Neural Networks in Real world: Imbalance, Noise, Privacy and OOD Challenges