TE-TFN: A Text-enhanced Transformer Fusion Network for Multimodal Knowledge Graph Completion

Jingchao Wang, Xiao Liu, Weimin Li, Fangfang Liu, Xing Wu, Qun Jin
DOI: https://doi.org/10.1109/mis.2024.3378921
IF: 6.744
2024-01-01
IEEE Intelligent Systems
Abstract: Multimodal knowledge graphs (MKGs) organize multimodal facts in the form of entities and relations and have been successfully applied to several downstream tasks. Because most MKGs are incomplete, the MKG completion (MKGC) task has been proposed, which aims to complete missing entities in MKGs. Most previous works obtain reasoning ability by capturing the correlation between target triplets and related images, but they ignore contextual semantic information, and their reasoning process is not easily explainable. To address these issues, we propose a novel text-enhanced transformer fusion network called TE-TFN, which converts the context path between head and tail entities into natural language text and fuses multimodal features at both coarse and fine granularities through a multi-granularity fuser. This design not only enhances text semantic information but also improves the interpretability of the model by introducing paths. Experimental results on benchmark datasets demonstrate the effectiveness of our model.
computer science, artificial intelligence; engineering, electrical & electronic