GTC: GNN-Transformer Co-contrastive Learning for Self-supervised Heterogeneous Graph Representation

Yundong Sun, Dongjie Zhu, Yansong Wang, Zhaoshuo Tian
2024-03-22
Abstract: Graph Neural Networks (GNNs) have emerged as the most powerful weapon for various graph tasks due to the message-passing mechanism's great local information aggregation ability. However, over-smoothing has always hindered GNNs from going deeper and capturing multi-hop neighbors. Unlike GNNs, Transformers can model global information and multi-hop interactions via multi-head self-attention and a proper Transformer structure can show more immunity to the over-smoothing problem. So, can we propose a novel framework to combine GNN and Transformer, integrating both GNN's local information aggregation and Transformer's global information modeling ability to eliminate the over-smoothing problem? To realize this, this paper proposes a collaborative learning scheme for GNN-Transformer and constructs GTC architecture. GTC leverages the GNN and Transformer branch to encode node information from different views respectively, and establishes contrastive learning tasks based on the encoded cross-view information to realize self-supervised heterogeneous graph representation. For the Transformer branch, we propose Metapath-aware Hop2Token and CG-Hetphormer, which can cooperate with GNN to attentively encode neighborhood information from different levels. As far as we know, this is the first attempt in the field of graph representation learning to utilize both GNN and Transformer to collaboratively capture different view information and conduct cross-view contrastive learning. The experiments on real datasets show that GTC exhibits superior performance compared with state-of-the-art methods. Codes can be available at
Machine Learning, Information Retrieval
What problem does this paper attempt to address?
The paper targets the over-smoothing problem that Graph Neural Networks (GNNs) encounter when they are deepened to capture multi-hop neighbor information, and it aims to achieve self-supervised heterogeneous graph representation learning by combining the strengths of GNNs and Transformers. Specifically, the paper addresses two key questions:

1. **How to combine GNN and Transformer**: The GNN and the Transformer serve as two branches that encode the local view and the global view (or the graph pattern view and the hop view) respectively. This pairs the GNN's local information aggregation with the Transformer's global structure modeling, so multi-hop information does not have to come from stacking GNN layers, which is where over-smoothing arises.
2. **Self-supervised heterogeneous graph representation learning**: A collaborative contrastive learning task is built on the representations from the two views. It strengthens information fusion between the views and provides the training signal without labels.

To address these questions, the paper proposes a new GNN-Transformer collaborative learning framework, referred to as GTC. According to the paper, the framework captures multi-hop neighbor information while avoiding over-smoothing, which improves the model's expressive capability, and experiments on real datasets show that GTC outperforms existing state-of-the-art methods.
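
The cross-view contrastive task described above can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a generic symmetric InfoNCE loss in which the only positive pair for node i is its own embedding in the other view, and the names `cross_view_info_nce`, `z_gnn`, `z_trans_aligned`, `z_trans_random`, and the `temperature` value are hypothetical choices for illustration. The paper may define positives and the loss differently.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def cross_view_info_nce(z_a, z_b, temperature=0.5):
    """Symmetric InfoNCE loss between two views of the same n nodes.

    Assumption (not from the paper): row i of z_a and row i of z_b form
    the only positive pair; every other row in the opposite view is a negative.
    """
    z_a = l2_normalize(z_a)
    z_b = l2_normalize(z_b)
    logits = z_a @ z_b.T / temperature  # shape (n, n), scaled cosine similarity

    def one_direction(l):
        # Numerically stable log-softmax over each row.
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # The diagonal holds the log-probability of each positive pair.
        return -np.mean(np.diag(log_prob))

    # Average the a->b and b->a directions.
    return 0.5 * (one_direction(logits) + one_direction(logits.T))


# Toy data standing in for the outputs of the two branches.
rng = np.random.default_rng(0)
z_gnn = rng.normal(size=(8, 16))                            # stand-in for GNN-branch embeddings
z_trans_aligned = z_gnn + 0.01 * rng.normal(size=(8, 16))   # second view that agrees with the first
z_trans_random = rng.normal(size=(8, 16))                   # second view unrelated to the first

loss_aligned = cross_view_info_nce(z_gnn, z_trans_aligned)
loss_random = cross_view_info_nce(z_gnn, z_trans_random)
print(loss_aligned, loss_random)
```

In this toy setup the loss is lower when the two views agree on each node than when they are unrelated, which is the pressure that pulls the two branches' representations of the same node together during training.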