DERA: Dense Entity Retrieval for Entity Alignment in Knowledge Graphs

Zhichun Wang, Xuan Chen
2024-08-02
Abstract: Entity Alignment (EA) aims to match equivalent entities in different Knowledge Graphs (KGs), which is essential for knowledge fusion and integration. Recently, embedding-based EA has attracted significant attention and many approaches have been proposed. Early approaches primarily focused on learning entity embeddings from the structural features of KGs, defined by relation triples. Later methods incorporated entities' names and attributes as auxiliary information to enhance embeddings for EA. However, these approaches often used different techniques to encode structural and attribute information, limiting their interaction and mutual enhancement. In this work, we propose a dense entity retrieval framework for EA, leveraging language models to uniformly encode various features of entities and facilitate nearest entity search across KGs. Alignment candidates are first generated through entity retrieval and are subsequently reranked to determine the final alignments. We conduct comprehensive experiments on both cross-lingual and monolingual EA datasets, demonstrating that our approach achieves state-of-the-art performance compared to existing EA methods.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses Entity Alignment (EA) in Knowledge Graphs (KGs): finding equivalent entities across different knowledge graphs, which is crucial for knowledge fusion and integration. It observes that existing EA methods typically encode structural information (relation triples) and attribute information (entity names, attribute values, etc.) with different techniques, which limits how much the two kinds of information can interact and reinforce each other. To address this, the paper proposes DERA, a dense entity retrieval framework that uses language models to encode the various features of entities uniformly and supports nearest-entity search across knowledge graphs.

The main contributions of DERA are:
1. Formalizing entity alignment as an entity retrieval task and building a language-model-based framework to perform it. Entity information is converted into textual descriptions, which the language model then encodes so that similarity between entities can be computed.
2. Designing an Entity Verbalization Model that generates homogeneous textual descriptions of entities, giving a unified representation of entity information from different sources and languages.
3. Developing an Entity Retrieval model and an Alignment Reranking model. The retrieval model encodes entities independently so that candidate alignments can be found efficiently; the reranking model considers the interaction within each entity pair to further improve alignment accuracy.

The experiments evaluate DERA on multiple cross-lingual and monolingual datasets and report state-of-the-art performance compared with existing EA methods. DERA is reported to be particularly robust and effective on tasks with high heterogeneity and language differences.
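
The verbalize, retrieve, and rerank flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: every function name here is hypothetical, the encoder is a hashed character-trigram vector standing in for a language model, and the reranker is a token-overlap score standing in for a model that reads both entities of a pair jointly. Only the overall shape (independent encoding for candidate retrieval, then pairwise scoring of the shortlist) follows the summary.

```python
import math
import zlib
from collections import Counter

DIM = 256  # size of the toy embedding space (an assumption of this sketch)


def verbalize(entity):
    """Flatten an entity's name, attributes, and relations into one text string."""
    parts = [entity["name"]]
    parts += [f"{key} is {value}" for key, value in entity.get("attributes", {}).items()]
    parts += [f"{rel} {obj}" for rel, obj in entity.get("relations", [])]
    return ". ".join(parts)


def encode(text):
    """Toy stand-in for a language-model encoder: hashed character trigrams,
    returned as a sparse, L2-normalised vector (dict of index -> weight)."""
    padded = f"  {text.lower()}  "
    counts = Counter(
        zlib.crc32(padded[i:i + 3].encode("utf-8")) % DIM
        for i in range(len(padded) - 2)
    )
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {idx: c / norm for idx, c in counts.items()}


def cosine(u, v):
    """Dot product of two L2-normalised sparse vectors, i.e. cosine similarity."""
    return sum(w * v.get(idx, 0.0) for idx, w in u.items())


def retrieve(query_vec, target_vecs, k):
    """Stage 1: entities are encoded independently; return the k nearest target ids."""
    ranked = sorted(
        target_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [entity_id for entity_id, _ in ranked[:k]]


def rerank(query_text, candidate_ids, target_texts):
    """Stage 2: score each (query, candidate) pair jointly.
    Toy stand-in for the reranker: Jaccard overlap of whitespace tokens."""
    query_tokens = set(query_text.lower().split())

    def pair_score(entity_id):
        target_tokens = set(target_texts[entity_id].lower().split())
        return len(query_tokens & target_tokens) / len(query_tokens | target_tokens)

    return sorted(candidate_ids, key=pair_score, reverse=True)


def align(source_entity, target_entities, k=2):
    """Verbalize, retrieve k candidates, rerank them, and return the best target id."""
    target_texts = {eid: verbalize(e) for eid, e in target_entities.items()}
    target_vecs = {eid: encode(text) for eid, text in target_texts.items()}
    query_text = verbalize(source_entity)
    candidates = retrieve(encode(query_text), target_vecs, k)
    return rerank(query_text, candidates, target_texts)[0]


# Hypothetical toy data: one source entity and three target-KG entities.
source = {
    "name": "Paris",
    "attributes": {"country": "France"},
    "relations": [("capital of", "France")],
}
targets = {
    "t1": {"name": "Paris", "attributes": {"nation": "France"},
           "relations": [("capital of", "France")]},
    "t2": {"name": "Paris Hilton", "attributes": {"occupation": "actress"}},
    "t3": {"name": "Berlin", "attributes": {"country": "Germany"},
           "relations": [("capital of", "Germany")]},
}
print(align(source, targets, k=2))
```

The two stages trade cost for accuracy: stage 1 needs only one encoding per entity, so target vectors can be precomputed and searched quickly, while stage 2 inspects each pair together and is therefore applied only to the short candidate list.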