RHCO: A Relation-aware Heterogeneous Graph Neural Network with Contrastive Learning for Large-scale Graphs

Ziming Wan, Deqing Wang, Xuehua Ming, Fuzhen Zhuang, Chenguang Du, Ting Jiang, Zhengyang Zhao
DOI: https://doi.org/10.48550/arXiv.2211.11752
2022-11-20
Abstract: Heterogeneous graph neural networks (HGNNs) have been widely applied in heterogeneous information network tasks, while most HGNNs suffer from poor scalability or weak representation when they are applied to large-scale heterogeneous graphs. To address these problems, we propose a novel Relation-aware Heterogeneous Graph Neural Network with Contrastive Learning (RHCO) for large-scale heterogeneous graph representation learning. Unlike traditional heterogeneous graph neural networks, we adopt the contrastive learning mechanism to deal with the complex heterogeneity of large-scale heterogeneous graphs. We first learn relation-aware node embeddings under the network schema view. Then we propose a novel positive sample selection strategy to choose meaningful positive samples. After learning node embeddings under the positive sample graph view, we perform a cross-view contrastive learning to obtain the final node representations. Moreover, we adopt the label smoothing technique to boost the performance of RHCO. Extensive experiments on three large-scale academic heterogeneous graph datasets show that RHCO achieves best performance over the state-of-the-art models.
Machine Learning, Artificial Intelligence
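The cross-view contrastive step mentioned in the abstract can be illustrated with a small, hedged sketch. It assumes an InfoNCE-style objective with cosine similarity and a temperature `tau`, computed in one direction only (network schema view against positive sample graph view). The abstract does not state the loss form, so these choices, along with the names `cosine`, `cross_view_loss`, and `positives`, are illustrative assumptions and not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_view_loss(z_schema, z_pos, positives, tau=0.5):
    """InfoNCE-style loss between two views (an assumed form, not the paper's).

    z_schema[i]  : embedding of node i under the network schema view
    z_pos[j]     : embedding of node j under the positive sample graph view
    positives[i] : indices of the nodes treated as positives for node i
    Every other node acts as a negative for node i.
    """
    n = len(z_schema)
    total = 0.0
    for i in range(n):
        # Similarity of node i (schema view) to every node in the other view.
        scores = [math.exp(cosine(z_schema[i], z_pos[j]) / tau) for j in range(n)]
        pos_mass = sum(scores[j] for j in positives[i])
        total += -math.log(pos_mass / sum(scores))
    return total / n


# Toy data: three nodes whose two views happen to agree exactly.
views = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = cross_view_loss(views, views, positives=[[0], [1], [2]])
shuffled = cross_view_loss(views, views, positives=[[2], [0], [1]])
print(aligned < shuffled)  # positives that match similar embeddings give a lower loss
```

The loss shrinks when the nodes marked as positive are also the ones whose embeddings agree across the two views, which is the behavior a cross-view objective is meant to reward.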
### What problems does this paper attempt to solve?

This paper targets two main problems that existing Heterogeneous Graph Neural Networks (HGNNs) face on large-scale heterogeneous graphs: **poor scalability** and **weak representational ability**. Specifically:

1. **Poor scalability**:
   - Most existing HGNNs require predefined metapaths when applied to large-scale heterogeneous graphs. This causes the number of neighbor nodes to grow exponentially with the number of nodes, so these models cannot be applied effectively to large-scale datasets.
2. **Weak representational ability**:
   - Existing HGNNs are usually trained under a single view, so they cannot fully exploit the information in heterogeneous graphs and struggle to learn effective node representations.

To solve these problems, the authors propose a new model named **RHCO** (Relation-aware Heterogeneous Graph Neural Network with Contrastive Learning). By introducing a contrastive learning mechanism and a cross-view learning strategy, RHCO handles large-scale heterogeneous graphs better and outperforms existing methods on three large-scale academic heterogeneous graph datasets.

### Main contributions of RHCO

1. **Introducing a relation-aware contrastive learning mechanism**:
   - The authors present RHCO as the first attempt to apply contrastive learning to large-scale heterogeneous graphs. Through cross-view contrastive learning, RHCO captures multiple relation-specific representations in heterogeneous graphs and uses the contrastive objective to improve the quality of node representations.
2. **Proposing an attention-weight-based positive sample selection strategy**:
   - This strategy avoids explicitly constructing metapath-based neighbor graphs, which significantly improves the scalability of heterogeneous graph neural networks on large-scale graphs.
   - It selects positive samples using attention weights computed by a pre-trained model, reducing the consumption of computing resources while preserving performance.
3. **Extensive experimental verification**:
   - The authors conducted extensive experiments on three large-scale public datasets. The results show that RHCO outperforms existing state-of-the-art models on node classification tasks, supporting its effectiveness on large-scale heterogeneous graphs.

Through these innovations, RHCO addresses the limitations of existing HGNNs on large-scale heterogeneous graphs and offers new directions for future research.
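The attention-weight-based positive sample selection described above can be sketched as a simple top-k filter. This is a minimal illustration under stated assumptions: the attention weights are taken as already produced by some pre-trained model and supplied as `(target, candidate, weight)` triples, each node is also counted as its own positive, and the function name `select_positives` and the parameter `k` are hypothetical. The summary does not specify how the authors rank or threshold the weights.

```python
import heapq
from collections import defaultdict


def select_positives(attention, k):
    """Pick, for each target node, the k candidates with the largest attention weight.

    attention : iterable of (target, candidate, weight) triples, assumed to come
                from a pre-trained attention-based model (not computed here)
    k         : number of positives to keep per target (hypothetical parameter)
    Returns {target: [target, top candidates...]}; including the target itself
    as a positive is an assumption of this sketch.
    """
    per_target = defaultdict(list)
    for target, candidate, weight in attention:
        if candidate != target:  # the target is added explicitly below
            per_target[target].append((weight, candidate))
    return {
        target: [target] + [cand for _, cand in heapq.nlargest(k, scored)]
        for target, scored in per_target.items()
    }


# Toy attention weights between paper nodes (made-up values for illustration).
weights = [
    ("p1", "p2", 0.6),
    ("p1", "p3", 0.1),
    ("p1", "p4", 0.3),
    ("p2", "p1", 0.9),
]
print(select_positives(weights, k=2))
# {'p1': ['p1', 'p2', 'p4'], 'p2': ['p2', 'p1']}
```

Because the selection only reads weights that already exist, no metapath-based neighbor graph has to be materialized, which is the scalability argument the summary attributes to this strategy.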