Temporal Knowledge Graph Completion using Box Embeddings

Johannes Messner, Ralph Abboud, İsmail İlkan Ceylan
DOI: https://doi.org/10.48550/arXiv.2109.08970
2021-09-19
Abstract: Knowledge graph completion is the task of inferring missing facts based on existing data in a knowledge graph. Temporal knowledge graph completion (TKGC) is an extension of this task to temporal knowledge graphs, where each fact is additionally associated with a time stamp. Current approaches for TKGC primarily build on existing embedding models which are developed for (static) knowledge graph completion, and extend these models to incorporate time, where the idea is to learn latent representations for entities, relations, and timestamps and then use the learned representations to predict missing facts at various time steps. In this paper, we propose BoxTE, a box embedding model for TKGC, building on the static knowledge graph embedding model BoxE. We show that BoxTE is fully expressive, and possesses strong inductive capacity in the temporal setting. We then empirically evaluate our model and show that it achieves state-of-the-art results on several TKGC benchmarks.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper addresses is **how to complete missing facts in a temporal knowledge graph (TKG)**. Specifically, it proposes BoxTE (Box Embedding for Temporal Knowledge Graph Completion), a model that aims to improve temporal knowledge graph completion by learning dedicated embeddings for timestamps.

### Problem Background

A temporal knowledge graph (TKG) extends a knowledge graph so that each fact carries a timestamp in addition to its entities and relation. Temporal knowledge graph completion (TKGC) is the task of predicting the facts that are missing from a given TKG. Existing TKGC methods mostly build on static knowledge graph embedding models and extend them to handle time. However, these methods often fail to capture temporal reasoning patterns well, especially patterns that relate facts across different timestamps.

### Main Contributions of the Paper

1. **Proposing the BoxTE model**: BoxTE builds on the static knowledge graph embedding model BoxE and represents time flexibly through dedicated timestamp embeddings.
2. **Expressiveness and inductive capacity**: The paper proves that BoxTE is fully expressive and has strong inductive capacity in the temporal setting. It can capture multiple temporal reasoning patterns, including rigid reasoning patterns and cross-time reasoning patterns.
3. **Experimental verification**: The authors demonstrate the effectiveness of BoxTE empirically on several benchmark datasets, where it achieves state-of-the-art performance on multiple metrics.

### Formulas and Technical Details

- **Entity and relation representations**:
  - Each entity \( h \in E \) is represented by two vectors: a base position vector \( e_h \in \mathbb{R}^d \) and a translation vector \( b_h \in \mathbb{R}^d \).
  - For a binary fact \( r(h, t) \), each entity's position is shifted by the other entity's translation vector, so the final head and tail representations are: \[ e_h^{r(h,t)} = e_h + b_t, \quad e_t^{r(h,t)} = e_t + b_h \]
- **Timestamp representations**:
  - Each timestamp \( \tau \in T \) has a matrix \( K_\tau \in \mathbb{R}^{k \times d} \), and each relation \( r \) has a vector of \( k \) scalars \( \alpha_r \in \mathbb{R}^k \).
  - For a relation-timestamp pair \( (r, \tau) \), the corresponding time translation is the \( d \)-dimensional vector: \[ \tau_r = \alpha_r K_\tau \]
  - For a temporal fact \( r(h, t \mid \tau) \), the final entity representations are: \[ e_h^{r(h,t \mid \tau)} = e_h + b_t + \tau_r, \quad e_t^{r(h,t \mid \tau)} = e_t + b_h + \tau_r \]
- **Scoring function**:
  - The scoring function of BoxTE encourages the entity representations to lie inside the corresponding relation boxes: \[ \text{score}(r(h, t)) = \|\delta(e_h^{r(h,t)}, r_h)\|_x + \|\delta(e_t^{r(h,t)}, r_t)\|_x \]
  - Here \( r_h \) and \( r_t \) denote the head and tail boxes of relation \( r \), \( \delta \) computes the distance between a point and a box, and \( \|\cdot\|_x \) is the \( L_x \) norm. For a temporal fact \( r(h, t \mid \tau) \), the time-shifted representations \( e_h^{r(h,t \mid \tau)} \) and \( e_t^{r(h,t \mid \tau)} \) take the place of the static ones.

### Experimental Results

The paper reports experiments on several benchmark datasets (such as ICEWS14, ICEWS5-15, and GDELT). BoxTE performs strongly on the reported evaluation metrics (such as MR, MRR, Hits@1, Hits@3, and Hits@10), with its most significant improvement over existing models on GDELT.
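
The entity, timestamp, and scoring definitions under Formulas and Technical Details can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the representation of a box as a `(lower, upper)` pair of corner vectors, and the toy values are all assumptions made for illustration. In particular, `point_to_box` is a simplified stand-in for \( \delta \): it returns the per-dimension gap between a point and an axis-aligned box (zero inside the box), which may differ from the distance function used in the paper.

```python
import numpy as np


def time_translation(alpha_r, K_tau):
    """Time translation tau_r = alpha_r K_tau: shape (k,) @ (k, d) -> (d,)."""
    return alpha_r @ K_tau


def point_to_box(point, lower, upper):
    """Simplified stand-in for delta (an assumption, not the paper's definition).

    Returns the per-dimension gap between the point and the box
    [lower, upper]; the gap is 0 in every dimension where the point
    already lies within the box.
    """
    return np.maximum(lower - point, 0.0) + np.maximum(point - upper, 0.0)


def boxte_score(e_h, b_h, e_t, b_t, alpha_r, K_tau, head_box, tail_box, x=2):
    """Score of a temporal fact r(h, t | tau) under the simplified delta."""
    tau_r = time_translation(alpha_r, K_tau)
    # Each entity is shifted by the other entity's translation vector,
    # then by the relation-specific time translation.
    head_point = e_h + b_t + tau_r
    tail_point = e_t + b_h + tau_r
    d_head = point_to_box(head_point, *head_box)
    d_tail = point_to_box(tail_point, *tail_box)
    return np.linalg.norm(d_head, ord=x) + np.linalg.norm(d_tail, ord=x)


# Toy example with d = 2 and k = 2 (all values are made up).
e_h, b_h = np.array([0.5, 0.5]), np.zeros(2)
e_t, b_t = np.array([2.0, 2.0]), np.zeros(2)
alpha_r = np.array([1.0, 0.0])
head_box = (np.array([0.0, 0.0]), np.array([1.0, 1.0]))
tail_box = (np.array([2.0, 2.0]), np.array([3.0, 3.0]))

K_near = np.array([[0.1, 0.1], [5.0, 5.0]])  # small shift: points stay in their boxes
K_far = np.array([[3.0, 0.0], [0.0, 0.0]])   # large shift: points leave their boxes

score_near = boxte_score(e_h, b_h, e_t, b_t, alpha_r, K_near, head_box, tail_box)
score_far = boxte_score(e_h, b_h, e_t, b_t, alpha_r, K_far, head_box, tail_box)
print(score_near, score_far)
```

Under this simplified distance, a score of zero means both shifted points lie inside their boxes, so the first timestamp yields a zero score and the second a positive one. Because \( \alpha_r \) weights the rows of \( K_\tau \), two relations can respond differently to the same timestamp, which is the flexibility the timestamp representation is meant to provide.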