Is Multi-Level Data Enhancement Helpful for Knowledge Graph? A New Perspective on Multimodal Fusion

Kang Yang, Ruiyun Yu, Bingyang Guo, Shi Zhen
DOI: https://doi.org/10.1016/j.knosys.2024.112285
IF: 8.139
2024-01-01
Knowledge-Based Systems
Abstract: In practical applications, knowledge graphs are typically associated with various types of multimodal data, including images. As representation learning methods based on unimodal knowledge graphs face performance challenges, interest in multimodal knowledge graphs is growing. However, the fusion of semantic information (the deeper, context-dependent meaning of the data) with graph structure still requires improvement. A few recent methods incorporate image-semantic information, but only at the model level. Herein, we propose a model called cross-level multimodal semantic embedding learning (CLMSE). First, CLMSE leverages semantic similarities among images to enhance the data distribution of knowledge graphs. It then uses the resulting pseudoconnections to improve its predictive performance. CLMSE extracts semantic information from the graph structure across varying neighborhood ranges using a higher-order and second-order decoder. Experiments show that CLMSE surpasses other multimodal knowledge graph embedding learning models in predictive performance, achieving state-of-the-art results. In a more in-depth analysis, we investigate the benefits of using image-semantic information to optimize the knowledge graph distribution for various models, highlighting the effectiveness of the method.
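The data-level enhancement step described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the abstract does not specify how image similarity is measured or how pseudoconnections are selected, so the cosine similarity measure, the fixed threshold, the relation name `similar_image`, and the function names below are all illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pseudo_connections(image_emb, triples, threshold=0.9, relation="similar_image"):
    """Propose extra triples between entities whose image embeddings are similar.

    image_emb: dict mapping entity name -> image feature vector (assumed precomputed).
    triples:   existing (head, relation, tail) triples of the knowledge graph.
    threshold and relation are hypothetical choices, not taken from the paper.
    """
    # Entity pairs that are already linked (in either direction) are skipped.
    existing = {(h, t) for h, _, t in triples} | {(t, h) for h, _, t in triples}
    entities = sorted(image_emb)
    added = []
    for i, a in enumerate(entities):
        for b in entities[i + 1:]:
            if (a, b) in existing:
                continue
            if cosine(image_emb[a], image_emb[b]) >= threshold:
                added.append((a, relation, b))
    return added


# Toy data: three entities with made-up 3-dimensional image features.
image_emb = {
    "cat": [1.0, 0.1, 0.0],
    "tiger": [0.9, 0.2, 0.0],
    "car": [0.0, 0.1, 1.0],
}
triples = [("cat", "is_a", "animal")]

# The augmented graph is the original triples plus the proposed pseudoconnections.
augmented = triples + pseudo_connections(image_emb, triples)
print(augmented)
```

In this toy example only the visually similar pair (cat, tiger) clears the threshold, so one pseudoconnection is added. Any embedding model could then be trained on the augmented triple set, which matches the abstract's claim that the enhancement is applied at the data level rather than inside a specific model.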