Abstract: Knowledge graphs (KGs) are collections of real-world knowledge represented in the structured form of triples. Because they were built manually in their nascent stage, they commonly suffer from missing links (triples). Knowledge graph completion (KGC) aims to find those missing links and thereby complete the KGs. However, as knowledge grows through diverse sources, new entities emerge rapidly and need to be connected to existing KGs. Open-world KGC therefore targets extending KGs to those new entities. Dealing with new entities is challenging because they have no connections to entities in the existing KGs. One way to handle them is to embed their textual descriptions with pre-trained word embeddings and score them in the graph-vector space with existing KGC models. These models have produced meaningful results, but few studies have used more recent neural networks such as pre-trained language models, which are known to capture context better than pre-trained word embeddings. This paper proposes a novel model that connects new entities to existing KGs through a pre-trained language model. To handle the problem, we use two learning methods: one is the classification method of the masked language model (MLM), which predicts a word from a huge vocabulary given a context, and the other is multi-task learning based on the Multi-Task Deep Neural Network (MT-DNN). Using these methods, the model first generates an embedding of a new entity from its textual description and then uses that embedding to find the existing entity in a KG to which the new entity can be connected.
Experimental results on three benchmark datasets, DBPedia50k, FB15k-237-OWE, and FB20k, show that the proposed model improves performance by 9.2, 4.4, and 11.1 percentage points, respectively, and achieves new state-of-the-art performance on all three datasets.
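
The two-step idea described above (embed a new entity from its description, then classify over the existing entities in the same way an MLM head classifies over a vocabulary) can be illustrated with a minimal sketch. This is not the paper's model: a toy bag-of-words encoder stands in for the pre-trained language model, multi-task learning is omitted, and every name, description, and entity here (`encode`, `rank_existing_entities`, the three example entities) is hypothetical.

```python
import math
from collections import Counter


def encode(description, vocab):
    """Toy stand-in for a language-model encoder (assumption, not the paper's):
    an L2-normalized bag-of-words vector over a fixed vocabulary."""
    counts = Counter(description.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_existing_entities(new_vec, entity_vecs):
    """Score the new entity against every existing entity and return
    (entity, probability) pairs, most likely first. This mirrors MLM-style
    classification, with existing entities playing the role of the vocabulary."""
    names = list(entity_vecs)
    scores = [sum(a * b for a, b in zip(new_vec, entity_vecs[n])) for n in names]
    probs = softmax(scores)
    return sorted(zip(names, probs), key=lambda pair: pair[1], reverse=True)


# Hypothetical existing entities with short textual descriptions.
existing = {
    "France": "country in europe whose capital is paris",
    "Germany": "country in europe whose capital is berlin",
    "Physics": "science of matter and energy",
}
vocab = sorted({w for text in existing.values() for w in text.split()})
entity_vecs = {name: encode(text, vocab) for name, text in existing.items()}

# A new entity that has no links to the KG yet, only a description.
new_vec = encode("paris is the capital city of france in europe", vocab)
ranked = rank_existing_entities(new_vec, entity_vecs)
print(ranked[0][0])  # top candidate for linking: France
```

In the setting the abstract describes, the encoder would be a pre-trained language model and the entity vectors would be learned classifier weights rather than encoded descriptions; the sketch only shows the shape of the computation.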

Knowledge graph extension with a pre-trained language model via unified learning method
