Abstract: A variety of knowledge graph embedding approaches have been developed. Most of them obtain embeddings by learning the structure of the knowledge graph within a link prediction setting. As a result, the embeddings reflect only the structure of a single knowledge graph, and embeddings for different knowledge graphs are not aligned, e.g., they cannot be used to find similar entities across knowledge graphs via nearest neighbor search. However, knowledge graph embedding applications such as entity disambiguation require a more global representation, i.e., a representation that is valid across multiple sources. We propose to learn universal knowledge graph embeddings from large-scale interlinked knowledge sources. To this end, we fuse large knowledge graphs based on the owl:sameAs relation such that every entity is represented by a unique identity. We instantiate our idea by computing universal embeddings based on DBpedia and Wikidata yielding embeddings for about 180 million entities, 15 thousand relations, and 1.2 billion triples. We believe our computed embeddings will support the emerging field of graph foundation models. Moreover, we develop a convenient API to provide embeddings as a service. Experiments on link prediction suggest that universal knowledge graph embeddings encode better semantics compared to embeddings computed on a single knowledge graph. For reproducibility purposes, we provide our source code and datasets open access.

What problem does this paper attempt to address?

This paper addresses the incompatibility between embeddings learned for different knowledge graphs by Knowledge Graph Embedding (KGE) models. Existing KGE methods usually capture the structure of a single knowledge graph only, whereas practical applications need a global representation that integrates multiple sources of information.

To this end, the authors propose learning universal knowledge graph embeddings from large-scale interlinked knowledge sources. They merge multiple knowledge graphs into a single graph and use the "owl:sameAs" relation to assign one unique ID to each set of matched entities, so that all entities share a unified embedding space. According to the summary, this reduces memory consumption and computational cost and also helps with knowledge graph incompleteness.

The authors evaluate the method with four KGE models (DistMult, ComplEx, QMult, and ConEx). In the link prediction task, universal knowledge graph embeddings perform better than embeddings computed on individual knowledge graphs, particularly with the ConEx model. The authors have also developed a convenient API that provides these embeddings as a service, and they have open-sourced their code and datasets to support reproducible research.

The paper also covers related work, including multilingual knowledge graph embedding, bootstrapping strategies based on matching scores, and methods that use additional entity attributes.

In summary, the paper asks how to learn universal entity embeddings that integrate information from multiple knowledge graphs, in order to improve the performance and practicality of knowledge graph applications.
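The fusion step described above (collapsing entities linked by owl:sameAs into one unique ID before training) can be illustrated with a small sketch. This is not the authors' implementation: the function names `build_canonical_ids` and `fuse_triples`, the union-find approach, the choice of the lexicographically smallest IRI as the representative, and the toy triples are all assumptions made for illustration. Relations are left unaligned here, which is also an assumption.

```python
def build_canonical_ids(same_as_pairs):
    """Return a lookup that maps any entity to the representative of its
    owl:sameAs equivalence class (hypothetical helper, union-find based)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        # Walk up to the root of x's equivalence class.
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in same_as_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Assumption: keep the lexicographically smaller ID so the
            # result does not depend on the order of the input pairs.
            keep, drop = (ra, rb) if ra < rb else (rb, ra)
            parent[drop] = keep
    return find


def fuse_triples(triples, same_as_pairs):
    """Merge triples from several graphs into one graph in which every
    matched entity is represented by a single ID."""
    find = build_canonical_ids(same_as_pairs)
    fused = set()
    for head, relation, tail in triples:
        if relation == "owl:sameAs":
            continue  # the links are consumed by the merge itself
        fused.add((find(head), relation, find(tail)))
    return sorted(fused)


# Toy data (made up for illustration, not taken from the paper's datasets).
same_as = [("dbr:Berlin", "wd:Q64"), ("dbr:Germany", "wd:Q183")]
triples = [
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("wd:Q64", "wdt:P17", "wd:Q183"),
    ("dbr:Berlin", "owl:sameAs", "wd:Q64"),
]
fused = fuse_triples(triples, same_as)
for triple in fused:
    print(triple)
```

In this toy run both source graphs end up describing the same two entity IDs, so a KGE model trained on `fused` would learn one embedding per real-world entity instead of one per source graph.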

Universal Knowledge Graph Embeddings

Knowledge Graph Embedding: An Overview

Knowledge Graph Embedding with Diversity of Structures

A Survey on Knowledge Graph Embedding: Approaches, Applications and Benchmarks

A Physical Embedding Model for Knowledge Graphs

Expeditious Generation of Knowledge Graph Embeddings

Multi-source Knowledge Embedding Research of Knowledge Graph

Knowledge Graph Embeddings in the Biomedical Domain: Are They Useful? A Look at Link Prediction, Rule Learning, and Downstream Polypharmacy Tasks

Knowledge Graph Embedding: A Survey of Approaches and Applications

Universal Representation Learning of Knowledge Bases by Jointly Embedding Instances and Ontological Concepts

Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings

Probabilistic Belief Embedding for Large-Scale Knowledge Population

CogKGE: A Knowledge Graph Embedding Toolkit and Benchmark for Representing Multi-source and Heterogeneous Knowledge

Geometry Interaction Knowledge Graph Embeddings

Differentially Private Federated Knowledge Graphs Embedding

Dual Graph Embedding for Object-Tag Link Prediction on the Knowledge Graph

Exploring the Generalization of Knowledge Graph Embedding

Knowledge Graph Embeddings: A Comprehensive Survey on Capturing Relation Properties

Large-scale knowledge graph representation learning

$μ\text{KG}$: A Library for Multi-source Knowledge Graph Embeddings and Applications