Abstract: Graph neural networks (GNNs) are the dominant paradigm for modeling graph-structured data by learning universal node representations. Training GNNs traditionally depends on large amounts of labeled data, which are time-consuming and costly to obtain and, in some scenarios, unavailable altogether. Self-supervised representation learning, which generates labels from the graph-structured data itself, is a promising approach to this problem. Self-supervised learning on heterogeneous graphs is more challenging than on homogeneous graphs and has been studied less. In this paper, we propose a SElf-supervised learning method for heterogeneous graphs via Structure Information based on Metapath (SESIM). First, pseudo-labels are constructed from the data itself to train the pretext tasks, avoiding time-consuming manual labeling. Next, we use traditional graph neural networks to aggregate node features and obtain node embeddings. The primary task and the pretext tasks are then defined on these node embeddings. The pretext tasks, i.e., predicting the jump number between nodes in each metapath, improve the representation ability of the primary task. Moreover, predicting jump numbers in each metapath effectively exploits graph structural information, which is an essential property of nodes, so SESIM deepens the model's understanding of graph structure. Finally, we train the primary task and the pretext tasks jointly and balance the contributions of the pretext tasks to the primary task. The key advantages of the proposed model are that it applies self-supervised learning to heterogeneous graphs, addressing the time and cost of obtaining labels, and that it introduces a novel pretext task, jump number prediction in each metapath, built from metapath-based graph structural information.
Empirical results validate SESIM and demonstrate that it improves the representation ability of traditional neural networks on link prediction and node classification tasks.
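
To illustrate the pretext task described above, the following minimal sketch builds jump-number pseudo-labels on a toy author-paper graph. It assumes that the jump number between two nodes is their shortest hop count in the graph induced by a metapath; the abstract does not spell out the exact definition, so this is an interpretation rather than the authors' implementation. The function names, the edge-list input format, and the `max_jumps` cutoff are all hypothetical.

```python
from collections import defaultdict, deque


def metapath_neighbors(edges, metapath):
    """Compose typed edge lists along `metapath` into an adjacency dict.

    edges: {(src_type, dst_type): [(src, dst), ...]} (hypothetical format).
    metapath: tuple of node types, e.g. ("author", "paper", "author").
    """
    # Every source node of the first relation starts by reaching itself.
    reach = {src: {src} for src, _ in edges[(metapath[0], metapath[1])]}
    for src_type, dst_type in zip(metapath, metapath[1:]):
        step = defaultdict(set)
        for src, dst in edges[(src_type, dst_type)]:
            step[src].add(dst)
        # Advance every frontier by one relation of the metapath.
        reach = {
            start: {nxt for mid in frontier for nxt in step[mid]}
            for start, frontier in reach.items()
        }
    # Drop self-loops so that a jump always moves to a different node.
    return {start: ends - {start} for start, ends in reach.items()}


def jump_number_labels(adjacency, max_jumps):
    """Run BFS from every node and label each ordered pair with its hop count.

    Assumption: "jump number" means shortest hop count in the
    metapath-induced graph, truncated at `max_jumps`.
    """
    labels = {}
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if dist[node] == max_jumps:
                continue  # do not expand beyond the cutoff
            for nxt in adjacency.get(node, ()):
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    labels[(source, nxt)] = dist[nxt]
                    queue.append(nxt)
    return labels


# Toy heterogeneous graph: three authors and two papers (illustrative data).
writes = [("a1", "p1"), ("a2", "p1"), ("a2", "p2"), ("a3", "p2")]
edges = {
    ("author", "paper"): writes,
    ("paper", "author"): [(p, a) for a, p in writes],
}

apa = metapath_neighbors(edges, ("author", "paper", "author"))
labels = jump_number_labels(apa, max_jumps=2)
print(sorted(labels.items()))
```

Each `(u, v)` pair with its hop count can then serve as a classification target for a pretext head on top of the node embeddings, with one such label set per metapath.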

Self-supervised learning for heterogeneous graph via structure information based on metapath