Integration of Semantic and Topological Structural Similarity Comparison for Entity Alignment without Pre-Training

Yao Liu, Ye Liu
DOI: https://doi.org/10.3390/electronics13112036
IF: 2.9
2024-05-24
Electronics
Abstract: Entity alignment (EA) is a critical task in integrating diverse knowledge graph (KG) data and plays a central role in data-driven AI applications. Traditional EA approaches rely on entity embeddings, but their effectiveness is limited by scarce KG input data and representation learning techniques. Large language models have shown promise, but face challenges such as high hardware requirements, large model sizes and computational inefficiency, which limit their applicability. To overcome these limitations, we propose an entity-alignment model that compares the similarity between entities by capturing both semantic and topological information to enable the alignment of entities with high similarity. First, we analyze descriptive information to quantify semantic similarity, including individual features such as types and attributes. Then, for topological analysis, we introduce four conditions based on graph connectivity and structural patterns to determine subgraph similarity within three hops of the entity's neighborhood, thereby improving accuracy. Finally, we integrate semantic and topological similarity using a weighted approach that considers dataset features. Our model requires no pre-training and is designed to be compact and generalizable to different datasets. Experimental results on four standard EA datasets validate the effectiveness of our proposed model.
Categories: Engineering, Electrical & Electronic; Computer Science, Information Systems; Physics, Applied
What problem does this paper attempt to address?
The paper aims to address the key issues in the task of Entity Alignment (EA). Specifically:
1. **Limitations of Existing Methods**: Traditional EA methods rely on entity embeddings, but their effectiveness is constrained by the scarcity of input data in Knowledge Graphs (KGs) and by the limitations of representation learning techniques. Methods based on Large Language Models (LLMs) perform well but face challenges such as high hardware requirements, large model sizes, and low computational efficiency.
2. **Proposed New Method**: To overcome these limitations, the authors propose an entity alignment model that does not require pre-training and aligns entities by combining semantic similarity and topological structure similarity. First, semantic similarity is quantified using descriptive information; then, subgraph similarity is determined by analyzing the connections and structural patterns between nodes; finally, semantic and topological similarities are fused to improve accuracy.
3. **Experimental Validation**: Experiments on four standard EA datasets demonstrate the effectiveness of the proposed model and show its advantages over existing methods, particularly in handling datasets with time-series characteristics.

In summary, this research is dedicated to developing an efficient and general entity alignment solution that performs well across different types of KG datasets.
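As a rough illustration of the weighted fusion described above, the following Python sketch combines a semantic score with a topological score taken over a three-hop neighborhood. It is not the authors' method: Jaccard overlap of attribute sets stands in for the semantic comparison, Jaccard overlap of relation labels within three hops stands in for the paper's four topological conditions, and the function names, the adjacency-list graph layout, and the weight `alpha` are all assumptions made for this example.

```python
from collections import deque


def jaccard(a, b):
    """Jaccard overlap of two collections; defined as 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relations_within_hops(graph, start, max_hops=3):
    """Collect relation labels on edges reachable within max_hops of start.

    graph is assumed to map an entity to a list of (relation, neighbor) pairs.
    """
    seen = {start}
    labels = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand edges beyond the hop limit
        for relation, neighbor in graph.get(node, []):
            labels.add(relation)
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return labels


def combined_similarity(attrs1, attrs2, graph1, e1, graph2, e2, alpha=0.5):
    """Weighted sum of the stand-in semantic and topological scores.

    alpha is a hypothetical weight; the paper chooses weights from dataset features.
    """
    semantic = jaccard(attrs1, attrs2)
    topological = jaccard(
        relations_within_hops(graph1, e1),
        relations_within_hops(graph2, e2),
    )
    return alpha * semantic + (1 - alpha) * topological
```

In this sketch, a candidate pair with a high combined score would be treated as aligned. Raising `alpha` favors descriptive information, which may suit datasets with rich attributes, while lowering it favors graph structure.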