Graph Structure Enhanced Pre-Training Language Model for Knowledge Graph Completion

Huashi Zhu, Dexuan Xu, Yu Huang, Zhi Jin, Weiping Ding, Jiahui Tong, Guoshuang Chong
DOI: https://doi.org/10.1109/tetci.2024.3372442
2024-01-01
IEEE Transactions on Emerging Topics in Computational Intelligence
Abstract: A vast amount of textual and structural information is required for knowledge graph construction and its downstream tasks. However, most current knowledge graphs are incomplete because knowledge is difficult to acquire and integrate. Knowledge Graph Completion (KGC) is used to predict missing connections. Previous studies use textual information and graph structural information independently, without an effective method for fusing the two. In this paper, we propose a graph structure enhanced pre-training language model for knowledge graph completion. First, we design a graph sampling algorithm and a Graph2Seq module that construct sub-graphs and their corresponding contexts to support large-scale knowledge graph learning and parallel training. These components are also the basis for fusing textual data and graph structure. Next, two pre-training tasks based on masked modeling are designed to capture accurate entity-level and relation-level information. Furthermore, this paper proposes a novel asymmetric Encoder-Decoder architecture to restore the masked components, where the encoder is a Pre-trained Language Model (PLM) and the decoder is a multi-relational Graph Neural Network (GNN). The purpose of the architecture is to integrate textual information effectively with graph structural information. Finally, the model is fine-tuned for KGC tasks on two widely used public datasets. The experiments show that the model outperforms the baselines on most metrics, which demonstrates the effectiveness of fusing the structural and semantic information of knowledge graphs.
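The abstract does not specify the sampling algorithm or the Graph2Seq format, so the following is only a minimal sketch of the general idea, under stated assumptions: a breadth-first sampler with a size cap stands in for the graph sampling step, and a simple triple-by-triple linearization with a `[MASK]` token stands in for Graph2Seq plus entity-level masking. The function names, the `[SEP]` separator, and the toy triples are all hypothetical and are not taken from the paper.

```python
from collections import deque

MASK = "[MASK]"  # assumed mask token, in the style of BERT-like PLMs


def sample_subgraph(triples, seed, max_triples=4):
    """Breadth-first sample of at most max_triples triples around a seed entity.

    This is an illustrative stand-in, not the paper's sampling algorithm.
    """
    # Index every triple under both of the entities it touches.
    adjacency = {}
    for triple in triples:
        head, _, tail = triple
        adjacency.setdefault(head, []).append(triple)
        adjacency.setdefault(tail, []).append(triple)

    sampled, visited, queue = [], {seed}, deque([seed])
    while queue and len(sampled) < max_triples:
        entity = queue.popleft()
        for triple in adjacency.get(entity, []):
            if triple in sampled:
                continue
            sampled.append(triple)
            if len(sampled) == max_triples:
                break
            # Enqueue newly discovered neighbours for later expansion.
            for neighbor in (triple[0], triple[2]):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
    return sampled


def graph_to_sequence(subgraph, masked_entity=None):
    """Linearize a sub-graph into text, optionally masking one entity."""
    parts = []
    for head, relation, tail in subgraph:
        head = MASK if head == masked_entity else head
        tail = MASK if tail == masked_entity else tail
        parts.append(f"{head} {relation} {tail}")
    return " [SEP] ".join(parts)


# Toy knowledge graph (hypothetical data).
triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
]

subgraph = sample_subgraph(triples, seed="Paris", max_triples=3)
sequence = graph_to_sequence(subgraph, masked_entity="France")
print(sequence)
```

In a setup like the one the abstract describes, a sequence of this kind would be fed to the PLM encoder, and a multi-relational GNN decoder operating on the sampled sub-graph would be trained to restore the masked entity or relation. That part is omitted here because the abstract gives no architectural details.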