Self-Contrastive Graph Diffusion Network

Yixian Ma, Kun Zhan
2023-07-27
Abstract: Augmentation techniques and sampling strategies are crucial in contrastive learning, but in most existing works the augmentations require careful design and the sampling strategies capture only a small amount of intrinsic supervision information. Existing methods also require complex designs to obtain two different representations of the data. To overcome these limitations, we propose a novel framework called the Self-Contrastive Graph Diffusion Network (SCGDN). Our framework consists of two main components: the Attentional Module (AttM) and the Diffusion Module (DiFM). AttM aggregates higher-order structure and feature information to produce a high-quality embedding, while DiFM balances the state of each node in the graph through Laplacian diffusion learning and allows the adjacency and feature information in the graph to evolve cooperatively. Unlike existing methods, SCGDN is an augmentation-free approach that avoids "sampling bias" and semantic drift and needs no pre-training. We sample positive and negative pairs based on structure and feature information. If two nodes are neighbors, they are considered positive samples of each other. If two disconnected nodes are also unrelated on the $k$NN graph, they are considered negative samples of each other. The contrastive objective makes use of this sampling strategy, and the redundancy reduction term minimizes redundant information in the embedding while retaining more discriminative information. Within this framework, the graph self-contrastive learning paradigm proves effective. SCGDN balances preserving high-order structure information against avoiding overfitting. The results show that SCGDN consistently outperforms both the contrastive methods and the classical methods.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper aims to address two key issues in the field of graph clustering:

1. **Issues with augmentation techniques and sampling strategies**: Existing contrastive learning methods often rely on complex augmentation techniques and carefully designed sampling strategies to obtain different representations of the data. These techniques usually require extensive parameter tuning in practice, and the sampling strategies capture only a limited amount of intrinsic supervision information.
2. **Issues with two-view representations**: Existing methods typically need two different representations of the data, which increases the complexity of model design.

To address these issues, the authors propose a new framework, the Self-Contrastive Graph Diffusion Network (SCGDN). SCGDN consists of two main components:

- **Attentional Module (AttM)**: Aggregates high-order structural and feature information of nodes to generate high-quality embedding representations.
- **Diffusion Module (DiFM)**: Balances the state of each node in the graph through Laplacian diffusion learning and allows the adjacency and feature information in the graph to co-evolve.

Compared to existing methods, SCGDN avoids "sampling bias" and semantic drift without any data augmentation techniques or pre-trained models. SCGDN also introduces a novel sampling strategy that uses the structure and feature information of the graph to select high-quality positive and negative sample pairs: neighboring nodes are positives of each other, and disconnected nodes that are also unrelated on the $k$NN graph are negatives of each other. Experimental results show that SCGDN outperforms existing contrastive learning methods and classical methods on multiple benchmark datasets.
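
The sampling rule stated in the abstract (neighbors are positives; disconnected nodes that are also unrelated on the $k$NN graph are negatives) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names `knn_graph` and `sample_masks`, the use of cosine similarity to build the $k$NN graph, and the symmetrization of both graphs are assumptions made for illustration only; the summary above does not specify them.

```python
import numpy as np


def knn_graph(features, k):
    """Symmetric boolean kNN adjacency built from cosine similarity.

    Assumption: cosine similarity and symmetrization are illustrative
    choices, not details taken from the paper.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbor
    nearest = np.argsort(-sim, axis=1)[:, :k]  # k most similar nodes per row
    n = features.shape[0]
    knn = np.zeros((n, n), dtype=bool)
    knn[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return knn | knn.T


def sample_masks(adjacency, features, k):
    """Return boolean (positives, negatives) masks over node pairs."""
    adj = adjacency.astype(bool)
    adj = adj | adj.T  # treat the graph as undirected
    np.fill_diagonal(adj, False)
    positives = adj  # neighbors are positives of each other
    # Negatives: not connected in the graph AND not related on the kNN graph.
    negatives = ~adj & ~knn_graph(features, k)
    np.fill_diagonal(negatives, False)
    return positives, negatives


# Toy graph: edges 0-1 and 2-3, with features forming the same two groups.
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
])
features = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.1, 0.9],
])
positives, negatives = sample_masks(adjacency, features, k=1)
```

In this toy example the pairs (0, 1) and (2, 3) come out as positives, and every cross-group pair comes out as a negative. A pair that is disconnected in the graph but close in feature space would fall into neither mask, which is how the rule keeps likely false negatives out of the contrastive objective.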