Abstract: Graph Neural Networks (GNNs) have been widely used to learn node representations, with outstanding performance on various tasks such as node classification. However, noise, which inevitably exists in real-world graph data, considerably degrades the performance of GNNs because it is easily propagated via the graph structure. In this work, we propose a novel and robust method, Bayesian Robust Graph Contrastive Learning (BRGCL), which trains a GNN encoder to learn robust node representations. The BRGCL encoder is completely unsupervised. Two steps are iteratively executed at each epoch of training the BRGCL encoder: (1) estimating confident nodes and computing robust cluster prototypes of node representations through a novel Bayesian nonparametric method; (2) prototypical contrastive learning between the node representations and the robust cluster prototypes. Experiments on public and large-scale benchmarks demonstrate the superior performance of BRGCL and the robustness of the learned node representations. The code of BRGCL is available at https://github.com/BRGCL-code/BRGCL-code.
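
Step (2) of the abstract can be made concrete with a generic prototypical contrastive objective. The formula below is a sketch of a standard InfoNCE-style loss over cluster prototypes, not necessarily the exact loss used in BRGCL. All symbols are notation assumed here: $\mathcal{T}$ is the set of confident nodes, $z_i$ the representation of node $i$, $c_k$ the prototype of cluster $k$ out of $K$ clusters, $\hat{y}_i$ the pseudo-label of node $i$, $\mathrm{sim}$ a similarity such as cosine similarity, and $\tau$ a temperature.

```latex
\mathcal{L}_{\text{proto}}
  = -\frac{1}{|\mathcal{T}|} \sum_{i \in \mathcal{T}}
    \log \frac{\exp\left(\mathrm{sim}(z_i, c_{\hat{y}_i}) / \tau\right)}
              {\sum_{k=1}^{K} \exp\left(\mathrm{sim}(z_i, c_k) / \tau\right)}
```

Minimizing this quantity pulls each confident node towards its own prototype and pushes it away from the other prototypes.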

What problem does this paper attempt to address?

The paper addresses how to train graph neural networks (GNNs) on noisy graph data so that the learned node representations are robust to the noise. Specifically, noise may exist in node attributes or node labels, and either kind can significantly degrade GNN performance. Existing GNN methods perform poorly on noisy data because noise propagates through the graph structure and affects what is learned for other nodes. The authors therefore propose a new method, Bayesian Robust Graph Contrastive Learning (BRGCL), which aims to improve the robustness of GNNs on noisy data.

### Problem Background

1. **The Influence of Noise**:
   - Noise in graph data falls into two main categories: attribute noise and label noise. Both degrade GNN performance.
   - Noise can propagate through the topological structure of the graph, further affecting the representation learning of other nodes.
2. **Limitations of Existing Methods**:
   - Manual cleaning and labeling of data can reduce the impact of noise, but it is costly, difficult to scale, and cannot handle large-scale online noisy data.
   - Most existing GNN methods do not consider noise in the input graph, which results in poor performance in practical applications.

### The Goals of BRGCL

To improve the robustness of GNNs on noisy data, BRGCL has the following main goals:

- **Completely Unsupervised**: BRGCL does not require any prior knowledge of true labels or categories, and depends only on the input node attributes for training.
- **Utilizing Confident Nodes**: BRGCL identifies the nodes that are more confident about their category labels through a new algorithm called Bayesian nonparametric Estimation of Confidence (BEC), and uses these nodes to guide model training.
- **Contrastive Learning**: BRGCL adopts a contrastive learning framework and learns robust node representations by maximizing the mutual information between different views.

### Key Points of the Solution

1. **BEC Algorithm**:
   - The BEC algorithm estimates confident nodes and their prototype representations. Confident nodes are nodes that lie far from the category boundaries and are not easily affected by noise.
   - Through a Bayesian nonparametric method, BEC infers pseudo-labels without true labels and estimates confident nodes based on these pseudo-labels.
2. **Contrastive Learning Framework**:
   - BRGCL generates two different graph views and learns robust node representations by maximizing the consistency between them.
   - BRGCL also adopts prototype-based contrastive learning, which further improves robustness by maximizing the mutual information between node representations and robust prototypes.
3. **Decoupled Training**:
   - To reduce the impact of noise on the classifier, BRGCL decouples node representation learning from the classification task. The BRGCL encoder is trained first to obtain robust node representations, and the classifier is then trained on these representations.

With these components, BRGCL performs better than existing methods on noisy data, and its robustness to noise has been verified in experiments.
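
The interplay between confident nodes, robust prototypes, and the prototype-based contrastive loss can be illustrated with a small pure-Python sketch. It is a toy illustration under stated assumptions, not the authors' implementation: the softmax-threshold rule in `select_confident` is a simple stand-in for BEC (which the summary describes as Bayesian nonparametric), and all function names, the temperature `tau`, and the `threshold` value are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident(embeddings, prototypes, tau=0.1, threshold=0.9):
    """Map node index -> pseudo-label for nodes whose assignment probability
    to one prototype is at least `threshold`.

    ASSUMPTION: this softmax threshold is a stand-in for BEC, not the
    Bayesian nonparametric estimator described in the paper.
    """
    confident = {}
    for i, z in enumerate(embeddings):
        probs = softmax([cosine(z, c) / tau for c in prototypes])
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            confident[i] = label
    return confident


def robust_prototypes(embeddings, confident, prototypes):
    """Recompute each prototype as the mean embedding of its confident nodes.
    A cluster with no confident node keeps its previous prototype."""
    updated = []
    for k, old in enumerate(prototypes):
        members = [embeddings[i] for i, label in confident.items() if label == k]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(list(old))
    return updated


def prototype_contrastive_loss(embeddings, confident, prototypes, tau=0.1):
    """InfoNCE-style loss: mean negative log-probability that each confident
    node matches its own prototype rather than any other prototype."""
    if not confident:
        return 0.0
    total = 0.0
    for i, label in confident.items():
        probs = softmax([cosine(embeddings[i], c) / tau for c in prototypes])
        total -= math.log(probs[label])
    return total / len(confident)


if __name__ == "__main__":
    # Toy 2-D embeddings: two clear clusters plus one ambiguous node (index 4).
    emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.7, 0.7]]
    protos = [[1.0, 0.0], [0.0, 1.0]]
    conf = select_confident(emb, protos)
    protos = robust_prototypes(emb, conf, protos)
    print(conf, protos, prototype_contrastive_loss(emb, conf, protos))
```

In this toy setup the ambiguous node sits exactly between the two prototypes, so it is excluded from the confident set and contributes neither to the prototype update nor to the loss. That is the intuition behind using only nodes far from category boundaries to guide training. In a real pipeline the embeddings would come from a GNN encoder and the loss would be minimized by gradient descent.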

Bayesian Robust Graph Contrastive Learning

G-Censor: Graph Contrastive Learning with Task-Oriented Counterfactual Views

Debiased Graph Contrastive Learning.

Low-Rank Graph Contrastive Learning for Node Classification

Learning Robust Node Representations on Graphs.

Certifiably Robust Graph Contrastive Learning

Learning Robust Representation through Graph Adversarial Contrastive Learning

Contrastive Graph Representation Learning with Adversarial Cross-view Reconstruction and Information Bottleneck

Contrastive Message Passing for Robust Graph Neural Networks with Sparse Labels

Neighbor Contrastive Learning on Learnable Graph Augmentation

Enhancing Graph Contrastive Learning with Node Similarity

Contrastive learning of graphs under label noise

Graph Contrastive Learning with Generative Adversarial Network

Adversarial Graph Augmentation to Improve Graph Contrastive Learning

Graph Contrastive Learning with Augmentations

Supervised contrastive learning for graph representation enhancement

Robust Hypergraph-Augmented Graph Contrastive Learning for Graph Self-Supervised Learning

Similarity Preserving Adversarial Graph Contrastive Learning

Towards Effective and Robust Graph Contrastive Learning with Graph Autoencoding

Learning on Graphs under Label Noise