Abstract: Joint representation learning over multi-sourced knowledge graphs (KGs) yields transferable and expressive embeddings that improve downstream tasks. Entity alignment (EA) is a critical step in this process. Despite recent considerable research progress in embedding-based EA, how it works remains to be explored. In this paper, we provide a similarity flooding perspective to explain existing translation-based and aggregation-based EA models. We prove that the embedding learning process of these models actually seeks a fixpoint of pairwise similarities between entities. We also provide experimental evidence to support our theoretical analysis. We propose two simple but effective methods inspired by the fixpoint computation in similarity flooding, and demonstrate their effectiveness on benchmark datasets. Our work bridges the gap between recent embedding-based models and the conventional similarity flooding algorithm. It would improve our understanding of and increase our faith in embedding-based EA.

What problem does this paper attempt to address?

The paper asks how embedding-based entity alignment (EA) models over multi-sourced knowledge graphs (KGs) come to produce aligned entity embeddings, and why those embeddings are effective. Specifically, it examines the working principles of these models, that is, how they generate similar entity embeddings so that the same entity in different KGs can be successfully aligned.

### Core Problems of the Paper

1. **Reasons for Entity Embedding Similarity**:
   - Although existing embedding-based EA techniques have made significant progress, a key question remains unanswered: **What factors make entity embeddings similar in EA models?**
2. **Connection between Theory and Traditional Methods**:
   - The connection between existing embedding-based EA models and traditional symbolic methods has not been fully explored.

### Solutions

To address these problems, the paper introduces the perspective of **Similarity Flooding (SF)** to explain and improve embedding-based EA models. SF is an algorithm widely used in structured data matching; its core idea is to propagate similarity iteratively until a fixed point is reached. The paper proves that existing translation-based and aggregation-based EA models are in fact seeking fixed points of pairwise similarity between entities.

### Main Contributions

1. **Theoretical Analysis**:
   - Provides the first theoretical analysis of embedding-based EA techniques, revealing the working mechanisms of these models.
   - Unifies basic translation-based and aggregation-based EA models from the perspective of similarity flooding.
   - Establishes a close connection between embedding-based and traditional symbolic methods through the unified fixed-point computation perspective.
2. **Proposing New Methods**:
   - Proposes two simple but effective methods to improve EA:
     - **Similarity-Flooding-Based Variant**: Computes the fixed point of similarity from entity combinations induced by TransE or GCN, without learning KG embeddings.
     - **Self-Propagating Connection**: Introduces self-propagating connections in neighborhood aggregation, giving entity embeddings the opportunity to propagate back to themselves and thereby improving alignment.
3. **Experimental Verification**:
   - Conducts experiments on benchmark datasets such as DBP15K and OpenEA, verifies the effectiveness of the proposed methods, and provides experimental evidence to support the theoretical conclusions.

### Formula Summary

- **Similarity Matrix**:
  \[ \Omega=(x_1; x_2; \ldots; x_n)^\top(y_1; y_2; \ldots; y_m)\in\mathbb{R}^{n\times m} \]
- **Fixed-Point Formula**:
  \[ \Omega = \text{normalize}\left(\Omega_0+\Lambda\Omega(\Lambda')^\top\right) \]
- **Self-Propagating Aggregation Function**:
  \[ e_{i + 1}=(1-\alpha)\oplus_{z\in N(e)}(z)+\alpha f(e_i) \]

Through these methods, the paper not only deepens our understanding of embedding-based EA techniques but also offers a new perspective for improving the performance of these models.
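The fixed-point formula in the summary can be made concrete with a small iteration. The sketch below is only an illustration in dependency-free Python, not the paper's implementation. It makes several assumptions: `normalize` divides by the largest absolute entry (as in classic similarity flooding), `lam_x` and `lam_y` stand in for \(\Lambda\) and \(\Lambda'\) and are taken to be plain adjacency matrices, and the helper names, the toy graphs, and the single seed pair are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def normalize(m):
    """Assumed normalization: scale so the largest absolute entry is 1."""
    peak = max(abs(v) for row in m for v in row)
    if peak == 0:
        return [row[:] for row in m]
    return [[v / peak for v in row] for row in m]


def similarity_fixpoint(omega0, lam_x, lam_y, max_iter=200, tol=1e-9):
    """Iterate omega <- normalize(omega0 + lam_x @ omega @ lam_y^T).

    Stops when the largest entry-wise change drops below tol or after
    max_iter rounds, whichever comes first.
    """
    omega = normalize(omega0)
    lam_y_t = transpose(lam_y)
    for _ in range(max_iter):
        # Propagate similarity from neighbor pairs to the pair itself.
        propagated = matmul(matmul(lam_x, omega), lam_y_t)
        # Add the seed similarities back in, then rescale.
        updated = normalize(
            [[s + p for s, p in zip(r0, rp)] for r0, rp in zip(omega0, propagated)]
        )
        delta = max(
            abs(u - o) for ru, ro in zip(updated, omega) for u, o in zip(ru, ro)
        )
        omega = updated
        if delta < tol:
            break
    return omega


# Hypothetical toy input: two identical 3-node path graphs (0 - 1 - 2)
# and a single seed alignment between node 0 of each graph.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
seed = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
omega = similarity_fixpoint(seed, path, path)
```

Each round spreads similarity from already-similar neighbor pairs to the pairs they connect, and adding `omega0` back in every round keeps the seed alignment in play. Under the assumed normalization, the returned matrix is scaled so that its largest entry is 1.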

What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings

Understanding and Improving Knowledge Graph Embedding for Entity Alignment

Knowledge Graph Entity Alignment Using Relation Structural Similarity

Iterative Entity Alignment via Joint Knowledge Embeddings

Revisiting Embedding-based Entity Alignment: A Robust and Adaptive Method

Integration of Semantic and Topological Structural Similarity Comparison for Entity Alignment without Pre-Training

Exploiting Global Semantic Similarities in Knowledge Graphs by Relational Prototype Entities

ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities

EventEA: Benchmarking Entity Alignment for Event-centric Knowledge Graphs

Revisit and Outstrip Entity Alignment: A Perspective of Generative Models

DERA: Dense Entity Retrieval for Entity Alignment in Knowledge Graphs

Do Similar Entities have Similar Embeddings?

Aligning Multiple Knowledge Graphs in a Single Pass

A benchmarking study of embedding-based entity alignment for knowledge graphs

Deep Reinforcement Learning for Entity Alignment

Cross-Lingual Entity Alignment via Joint Attribute-Preserving Embedding

Unifying Dual-Space Embedding for Entity Alignment via Contrastive Learning

Jointly Learning Entity and Relation Representations for Entity Alignment

Multi-view Knowledge Graph Embedding for Entity Alignment

Knowledge graph embedding with shared latent semantic units