StructComp: Substituting propagation with Structural Compression in Training Graph Contrastive Learning

Shengzhong Zhang, Wenjie Yang, Xinyuan Cao, Hongwei Zhang, Zengfeng Huang
2024-04-21
Abstract: Graph contrastive learning (GCL) has become a powerful tool for learning graph data, but its scalability remains a significant challenge. In this work, we propose a simple yet effective training framework called Structural Compression (StructComp) to address this issue. Inspired by a sparse low-rank approximation on the diffusion matrix, StructComp trains the encoder with the compressed nodes. This allows the encoder not to perform any message passing during the training stage, and significantly reduces the number of sample pairs in the contrastive loss. We theoretically prove that the original GCL loss can be approximated with the contrastive loss computed by StructComp. Moreover, StructComp can be regarded as an additional regularization term for GCL models, resulting in a more robust encoder. Empirical studies on various datasets show that StructComp greatly reduces the time and memory consumption while improving model performance compared to the vanilla GCL models and scalable training methods.
Machine Learning
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses the scalability of Graph Contrastive Learning (GCL) on large-scale datasets. Specifically:

1. **Scalability Challenges of Graph Contrastive Learning**:
   - The number of nodes that must be computed during message passing grows exponentially with the number of message-passing layers.
   - Graph contrastive learning typically requires a large number of sample pairs, so computational and memory costs can grow with the square of the number of nodes.
2. **Proposed Solution**:
   - The paper proposes a training framework called "Structural Compression" (StructComp). Motivated by a sparse low-rank approximation of the diffusion matrix, it trains the encoder on compressed nodes, so the encoder performs no message passing during training and the contrastive loss uses far fewer sample pairs.
   - Besides simplifying training, StructComp acts as an additional regularization term, which makes the encoder more robust.
3. **Theoretical Analysis**:
   - The paper proves that the original graph contrastive loss can be approximated by the contrastive loss computed on the compressed nodes, and that StructComp can be viewed as an additional regularization term for GCL models.
4. **Experimental Results**:
   - Experiments on multiple datasets show that StructComp substantially reduces training time and memory usage while improving model performance relative to vanilla GCL models and scalable training methods, especially on large-scale datasets.
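
The idea of training on compressed nodes can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a node-to-cluster assignment is already available (any graph partitioning method could supply one), uses mean pooling as the compression step, a single linear layer as a stand-in encoder, random feature masking as the augmentation, and an InfoNCE-style loss as the contrastive objective. The names `compress_features`, `info_nce`, and `assignment` are hypothetical.

```python
import numpy as np

def compress_features(X, assignment, num_clusters):
    """Mean-pool node features within each cluster.

    X: (n, d) node feature matrix.
    assignment: (n,) integer cluster id per node (assumed given; hypothetical input).
    Returns the (num_clusters, d) compressed features and the (n, num_clusters)
    one-hot partition matrix P.
    """
    n = X.shape[0]
    P = np.zeros((n, num_clusters))
    P[np.arange(n), assignment] = 1.0
    sizes = P.sum(axis=0)               # number of nodes in each cluster
    X_c = (P.T @ X) / sizes[:, None]    # per-cluster mean of the node features
    return X_c, P

def info_nce(Z1, Z2, tau=0.5):
    """InfoNCE-style loss; row i of Z1 and row i of Z2 form the positive pair."""
    Z1 = Z1 / (np.linalg.norm(Z1, axis=1, keepdims=True) + 1e-12)
    Z2 = Z2 / (np.linalg.norm(Z2, axis=1, keepdims=True) + 1e-12)
    logits = Z1 @ Z2.T / tau            # (k, k) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
n, d, k, h = 1000, 16, 20, 8            # nodes, feature dim, clusters, hidden dim
X = rng.normal(size=(n, d))
assignment = np.arange(n) % k           # toy partition; every cluster is non-empty

X_c, P = compress_features(X, assignment, k)

# Stand-in encoder: one linear layer applied to compressed nodes only,
# so no adjacency matrix (and no message passing) is involved here.
W = rng.normal(size=(d, h)) * 0.1
mask1 = rng.random(d) > 0.2             # assumed augmentation: feature masking
mask2 = rng.random(d) > 0.2
Z1 = np.tanh((X_c * mask1) @ W)
Z2 = np.tanh((X_c * mask2) @ W)
loss = info_nce(Z1, Z2)

pairs_full = n * n                      # pairs if contrasting all nodes
pairs_compressed = k * k                # pairs when contrasting compressed nodes
```

In this toy setting the similarity matrix shrinks from `n × n` to `k × k`, which is the source of the reduction in sample pairs described above. In a real pipeline the linear layer would be replaced by the trained encoder, and the full graph would be used only at inference time.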