IBMEA: Exploring Variational Information Bottleneck for Multi-modal Entity Alignment

Taoyu Su, Jiawei Sheng, Shicheng Wang, Xinghua Zhang, Hongbo Xu, Tingwen Liu
2024-07-28
Abstract: Multi-modal entity alignment (MMEA) aims to identify equivalent entities between multi-modal knowledge graphs (MMKGs), where the entities can be associated with related images. Most existing studies integrate multi-modal information heavily relying on the automatically-learned fusion module, rarely suppressing the redundant information for MMEA explicitly. To this end, we explore variational information bottleneck for multi-modal entity alignment (IBMEA), which emphasizes the alignment-relevant information and suppresses the alignment-irrelevant information in generating entity representations. Specifically, we devise multi-modal variational encoders to generate modal-specific entity representations as probability distributions. Then, we propose four modal-specific information bottleneck regularizers, limiting the misleading clues in refining modal-specific entity representations. Finally, we propose a modal-hybrid information contrastive regularizer to integrate all the refined modal-specific representations, enhancing the entity similarity between MMKGs to achieve MMEA. We conduct extensive experiments on two cross-KG and three bilingual MMEA datasets. Experimental results demonstrate that our model consistently outperforms previous state-of-the-art methods, and also shows promising and robust performance in low-resource and high-noise data scenarios.
Computation and Language, Multimedia
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses Multi-modal Entity Alignment (MMEA): identifying equivalent entities across multi-modal knowledge graphs. Specifically:

1. **Background and Challenges**:
   - Multi-modal Knowledge Graphs (MMKGs) extend traditional knowledge graphs by associating entities with multi-modal data such as images, which ties the graph's symbols to the physical world.
   - Existing methods mostly rely on automatically learned fusion modules to integrate multi-modal information, and rarely suppress explicitly the redundant information that does not help MMEA.
2. **Objectives**:
   - Apply the Variational Information Bottleneck (VIB) to MMEA through a new framework, IBMEA, which emphasizes alignment-relevant information and suppresses alignment-irrelevant information when generating entity representations.
3. **Specific Methods**:
   - Design multi-modal variational encoders that generate modal-specific entity representations as probability distributions.
   - Propose four modal-specific information bottleneck regularizers that limit misleading clues while refining the modal-specific entity representations.
   - Introduce a modal-hybrid information contrastive regularizer that integrates all refined modal-specific representations and increases the similarity of equivalent entities across MMKGs.

The experiments cover two cross-KG and three bilingual MMEA datasets. The model is reported to outperform previous state-of-the-art methods consistently, and to remain robust in low-resource and high-noise data scenarios.
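
The three method steps above can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation: the Gaussian form of the encoder output, the standard-normal prior in the KL term, fusion by concatenation, the InfoNCE-style contrastive loss, and every function name and the `beta` and `temperature` values are assumptions chosen for illustration, following the generic variational information bottleneck recipe.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)): an assumed compression term."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def info_nce(source, target, temperature=0.1):
    """Contrastive loss: the i-th source entity should match the i-th target entity."""
    loss = 0.0
    for i, s in enumerate(source):
        logits = [cosine(s, t) / temperature for t in target]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        loss += log_sum_exp - logits[i]
    return loss / len(source)


def encode_graph(entities, rng):
    """Sample each modality of each entity, fuse by concatenation, average the KL."""
    fused, kl = [], 0.0
    for modalities in entities:
        sample = []
        for mu, log_var in modalities:
            sample.extend(reparameterize(mu, log_var, rng))
            kl += kl_to_standard_normal(mu, log_var)
        fused.append(sample)
    return fused, kl / len(entities)


def toy_objective(source, target, beta=0.01, seed=0):
    """Contrastive alignment loss plus beta-weighted per-modality KL penalties."""
    rng = random.Random(seed)
    z_src, kl_src = encode_graph(source, rng)
    z_tgt, kl_tgt = encode_graph(target, rng)
    return info_nce(z_src, z_tgt) + beta * (kl_src + kl_tgt)


# Two entities per graph, two toy modalities each, given as (mu, log_var) pairs.
graph_a = [[([1.0, 0.0], [-4.0, -4.0]), ([0.5, 0.5], [-4.0, -4.0])],
           [([0.0, 1.0], [-4.0, -4.0]), ([-0.5, 0.5], [-4.0, -4.0])]]
graph_b = [[([0.9, 0.1], [-4.0, -4.0]), ([0.4, 0.6], [-4.0, -4.0])],
           [([0.1, 0.9], [-4.0, -4.0]), ([-0.6, 0.4], [-4.0, -4.0])]]
loss = toy_objective(graph_a, graph_b)
```

In this sketch the KL term plays the role of the bottleneck: it penalizes how much each modal-specific distribution departs from an uninformative prior, while the contrastive term rewards keeping whatever makes equivalent entities similar. The weight `beta` trades one against the other; the paper's actual regularizers may be defined differently.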