From Discrimination to Generation: Knowledge Graph Completion with Generative Transformer

Xin Xie, Ningyu Zhang, Zhoubo Li, Shumin Deng, Hui Chen, Feiyu Xiong, Mosha Chen, Huajun Chen
DOI: https://doi.org/10.1145/3487553.3524238
2023-03-14
Abstract: Knowledge graph completion aims to extend a KG with missing triples. In this paper, we present GenKGC, an approach that converts knowledge graph completion into a sequence-to-sequence generation task with a pre-trained language model. We further introduce relation-guided demonstration and entity-aware hierarchical decoding for better representation learning and faster inference. Experimental results on three datasets show that our approach obtains better or comparable performance relative to baselines and achieves faster inference than previous methods based on pre-trained language models. We also release a new large-scale Chinese knowledge graph dataset, AliopenKG500, for research purposes. Code and datasets are available at https://github.com/zjunlp/PromptKG/tree/main/GenKGC.
Computation and Language, Artificial Intelligence, Databases, Information Retrieval, Machine Learning
What problem does this paper attempt to address?
This paper addresses the link prediction task in Knowledge Graph Completion (KGC). The authors propose GenKGC, a method that aims to predict missing triples more efficiently and accurately by recasting knowledge graph completion as a sequence-to-sequence generation task. Traditional KGC methods are usually based on knowledge embedding techniques. These methods need to score all possible triples, which is time-consuming and computationally expensive on large-scale datasets. They also face the problem of unstable negative samples during the inference process. The main goal of the paper is therefore to reduce inference time while maintaining performance. To that end, it introduces relation-guided demonstration, intended to improve the representation learning of entities and relations, and entity-aware hierarchical decoding, intended to reduce the time complexity of generation. The experimental results show that, compared with existing baseline models, GenKGC achieves better or comparable performance on multiple datasets with faster inference.
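To make the contrast with scoring every candidate triple concrete, the following is a minimal sketch of generating a tail entity token by token while restricting each step to continuations that spell out a known entity. It uses a prefix trie, which is one common way to constrain generation and is an assumption here; the paper's entity-aware hierarchical decoding may differ in its details. The names `build_trie`, `constrained_greedy_decode`, the `<eos>` marker, and the toy scores are all hypothetical, and the `score` function stands in for a real sequence-to-sequence model.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-entity marker, not taken from the paper


def build_trie(entities: Sequence[Sequence[str]]) -> Dict:
    """Build a nested-dict prefix trie over tokenized entity names."""
    root: Dict = {}
    for tokens in entities:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: Dict,
    score: Callable[[List[str], str], float],
) -> List[str]:
    """Greedily pick the best-scoring token among those the trie allows.

    `score(prefix, token)` is a stand-in for a decoder's next-token score.
    Assumes the trie is non-empty.
    """
    node, prefix = trie, []
    while True:
        allowed = list(node.keys())  # only tokens that continue a valid entity
        best = max(allowed, key=lambda tok: score(prefix, tok))
        if best == END:
            return prefix
        prefix.append(best)
        node = node[best]


# Toy usage: three candidate entities and made-up token scores.
entities = [["Steve", "Jobs"], ["Steve", "Wozniak"], ["Apple", "Inc"]]
trie = build_trie(entities)

toy_scores = {"Steve": 0.9, "Apple": 0.1, "Jobs": 0.8, "Wozniak": 0.2, END: 1.0}
prediction = constrained_greedy_decode(
    trie, lambda prefix, tok: toy_scores.get(tok, 0.0)
)
print(prediction)
```

In this sketch the number of decoding steps depends on the length of the generated entity name, not on the number of entities in the graph, which illustrates why a generative formulation can avoid scoring every candidate triple. A real system would typically use beam search over model logits instead of the greedy loop shown here.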