Dual Low-Rank Multimodal Fusion
Tao Jin, Siyu Huang, Yingming Li, Zhongfei Zhang
DOI: https://doi.org/10.18653/v1/2020.findings-emnlp.35
2020-01-01
Abstract: Tensor-based fusion methods have proven effective in multimodal fusion tasks. However, existing tensor-based methods make poor use of the fine-grained temporal dynamics of multimodal sequential features. Motivated by this observation, this paper proposes a novel multimodal fusion method called Fine-Grained Temporal Low-Rank Multimodal Fusion (FT-LMF). FT-LMF correlates the features of individual time steps across modalities, but its computation involves multiplications of high-order tensors. This paper further proposes Dual Low-Rank Multimodal Fusion (Dual-LMF), which reduces the computational complexity of FT-LMF through low-rank tensor approximation along dual dimensions of the input features. Dual-LMF is conceptually simple, effective, and efficient in practice. Empirical studies on benchmark multimodal analysis tasks show that our proposed methods outperform state-of-the-art tensor-based fusion methods with similar computational complexity.
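To illustrate why a low-rank approximation avoids high-order tensor multiplications, the sketch below shows the generic low-rank fusion identity for two modality vectors. It is not the paper's FT-LMF or Dual-LMF, whose time-step-level formulation is not given in the abstract. All names (`low_rank_fusion`, `full_tensor_fusion`, `factors_a`, `factors_b`) are hypothetical, and appending a constant 1 to each feature vector is an assumed convention borrowed from earlier tensor fusion work. If the weight tensor is a sum of `rank` outer products of per-modality factors, contracting it with the outer product of the inputs gives the same result as multiplying the per-modality projections elementwise and summing over the rank.

```python
import random


def append_one(x):
    """Append a constant 1 so unimodal terms survive the product (assumed convention)."""
    return list(x) + [1.0]


def matvec(W, z):
    """Multiply a matrix stored as a list of rows by a vector."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def low_rank_fusion(za, zb, factors_a, factors_b):
    """h[k] = sum_r (Wa_r za)[k] * (Wb_r zb)[k]; no outer product is formed."""
    out_dim = len(factors_a[0])
    h = [0.0] * out_dim
    for Wa, Wb in zip(factors_a, factors_b):
        pa = matvec(Wa, za)
        pb = matvec(Wb, zb)
        for k in range(out_dim):
            h[k] += pa[k] * pb[k]
    return h


def full_tensor_fusion(za, zb, factors_a, factors_b):
    """Reference: rebuild each weight entry W[k][i][j] and contract it with za (x) zb."""
    out_dim = len(factors_a[0])
    h = []
    for k in range(out_dim):
        total = 0.0
        for i, ai in enumerate(za):
            for j, bj in enumerate(zb):
                w_kij = sum(Wa[k][i] * Wb[k][j]
                            for Wa, Wb in zip(factors_a, factors_b))
                total += w_kij * ai * bj
        h.append(total)
    return h


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen only for illustration.
rng = random.Random(0)
rank, out_dim = 3, 4
za = append_one([rng.uniform(-1.0, 1.0) for _ in range(5)])  # e.g. one modality
zb = append_one([rng.uniform(-1.0, 1.0) for _ in range(2)])  # e.g. another modality
factors_a = [random_matrix(out_dim, len(za), rng) for _ in range(rank)]
factors_b = [random_matrix(out_dim, len(zb), rng) for _ in range(rank)]

h_low = low_rank_fusion(za, zb, factors_a, factors_b)
h_full = full_tensor_fusion(za, zb, factors_a, factors_b)
```

In this toy setting the factorized path costs on the order of `rank * out_dim * (len(za) + len(zb))` multiplications, while the reference path scales with `out_dim * len(za) * len(zb)`. That gap grows quickly as more modalities or time steps are added.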