A Multimodal Resource Balancing Approach Based on Cross-Attention and Bidirectional Associative Memory Networks

Xingang Wang, Honglu D. Cheng, Guangzheng Liu, Xiaoyu Liu
DOI: https://doi.org/10.1117/12.3010753
2023-01-01
Abstract: When a disaster occurs, mining social media posts that contain disaster information to analyze how the disaster unfolds can help the relevant authorities make rapid emergency decisions and analyze public opinion. The different modalities in a social media post jointly describe the same event and are therefore semantically related, but structural and semantic imbalances often exist between heterogeneous modalities. Most current research addresses either formal complementation or semantic balancing, whereas multi-granularity resource balancing helps extract consistent information from heterogeneous modalities and remove redundancy. To address these problems, this paper proposes an end-to-end Multimodal Resource Balancing (MMRB) model that combines cross-attention and bidirectional associative memory network modules while avoiding the loss of important unimodal information. The cross-attention module captures the deep semantic correlations between modalities and weights the modal features to achieve semantic-level resource balancing. The associative memory module uses the features of each modality to generate a complete feature representation of the other modality on a pre-trained transfer learning model, complementing multimodal information at the feature level and mitigating the modality imbalance and missing-modality problems of unstructured social media data. Finally, a joint feature encoder-decoder weights the fused features so that the model attends appropriately to the contribution of each modality's features within the joint features and to their intrinsic associations. Experiments on CrisisMMD, a public dataset of social media image-text posts used for disaster detection, show that the model achieves higher accuracy than current multimodal resource balancing models, which verifies its effectiveness.
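The abstract does not give the equations of the cross-attention module, so the following is only a minimal sketch of generic scaled dot-product cross-attention, in which queries come from one modality and keys and values from the other. It is not the MMRB implementation: the function names, the toy feature values, and the omission of learned projection matrices are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities.

    queries come from one modality; keys and values come from the other.
    Learned projections are omitted here (an illustrative simplification).
    """
    d = len(keys[0])
    width = len(values[0])
    attended = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the other modality's value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return attended

# Hypothetical toy features: 2 text tokens and 3 image regions, width 2.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Text attends to image regions, and image regions attend to text tokens.
text_given_image = cross_attention(text_feats, image_feats, image_feats)
image_given_text = cross_attention(image_feats, text_feats, text_feats)
```

Each output row is a convex combination of the other modality's features, which is one simple way to read "weighting the modal features" by cross-modal correlation. A real model would add learned query, key, and value projections and multiple heads.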