GPPT: Graph Pre-training and Prompt Tuning to Generalize Graph Neural Networks
Mingchen Sun,Kaixiong Zhou,Xin He,Ying Wang,Xin Wang
DOI: https://doi.org/10.1145/3534678.3539249
2022-01-01
Abstract:Despite the promising representation learning of graph neural networks (GNNs), the supervised training of GNNs notoriously requires large amounts of labeled data from each application. An effective solution is to apply the transfer learning in graph: using easily accessible information to pre-train GNNs, and fine-tuning them to optimize the downstream task with only a few labels. Recently, many efforts have been paid to design the self-supervised pretext tasks, and encode the universal graph knowledge among the various applications. However, they rarely notice the inherent training objective gap between the pretext and downstream tasks. This significant gap often requires costly fine-tuning for adapting the pre-trained model to downstream problem, which prevents the efficient elicitation of pre-trained knowledge and then results in poor results. Even worse, the naive pre-training strategy usually deteriorates the downstream task, and damages the reliability of transfer learning in graph data. To bridge the task gap, we propose a novel transfer learning paradigm to generalize GNNs, namely graph pre-training and prompt tuning (GPPT). Specifically, we first adopt the masked edge prediction, the most simplest and popular pretext task, to pre-train GNNs. Based on the pre-trained model, we propose the graph prompting function to modify the standalone node into a token pair, and reformulate the downstream node classification looking the same as edge prediction. The token pair is consisted of candidate label class and node entity. Therefore, the pre-trained GNNs could be applied without tedious fine-tuning to evaluate the linking probability of token pair, and produce the node classification decision. The extensive experiments on eight benchmark datasets demonstrate the superiority of GPPT, delivering an average improvement of 4.29% in few-shot graph analysis and accelerating the model convergence up to 4.32X. The code is available in: https://github.com/MingChen-Sun/GPPT.