MVAIBNet: Multiview Disentangled Representation Learning with Information Bottleneck
Ming Yin, Xin Liu, Junli Gao, Haoliang Yuan, Taisong Jin, Shengwei Zhang, Lingling Li
DOI: https://doi.org/10.1109/tii.2024.3397357
2024-01-01
Abstract: Multiview representation learning has recently attracted significant attention in the machine learning and computer vision communities. However, when fusing information from multiple views, existing work often neglects to exploit the complementary information in each view and to make the model interpretable. To this end, in this article, we propose a multiview attention fusion information bottleneck network, termed MVAIBNet. Specifically, MVAIBNet uses dual-path reconstruction to extract latent embeddings of the views, where view-peculiar representations are disentangled from the embeddings using beta-VAE to help reconstruct each view. Then, to align and fuse view-common representations, an attention fusion unit, namely the multiview channel fusion unit (MVCFU), is introduced. Furthermore, relying on the information bottleneck principle, we integrate the consistency information and specificity information of the views to promote a compact semantic representation of multiple views while flexibly balancing complementarity and consistency among them. Extensive experimental results on four real-world datasets show that our algorithm achieves encouraging performance on several evaluation metrics compared with state-of-the-art methods.
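
As a rough illustration of two ingredients named in the abstract, the sketch below shows a standard beta-VAE objective (reconstruction error plus a beta-weighted KL term against a standard normal prior) and a generic softmax-weighted fusion of per-view embeddings. It is not the authors' implementation: the function names, the squared-error reconstruction term, the default `beta=4.0`, and the use of one scalar attention score per view are all assumptions made for illustration, and the actual MVCFU operates on channels in a way the abstract does not specify.

```python
import math

def kl_diag_gaussian(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a single sample."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar))

def beta_vae_loss(x, x_hat, mu, logvar, beta=4.0):
    """Squared-error reconstruction plus beta-weighted KL (beta value is hypothetical)."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + beta * kl_diag_gaussian(mu, logvar)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(view_embeddings, view_scores):
    """Weighted sum of per-view embeddings; weights are softmax(view_scores).

    Assumption: one scalar score per view. This is a stand-in for an
    attention fusion unit, not the MVCFU described in the paper.
    """
    weights = softmax(view_scores)
    dim = len(view_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, view_embeddings))
        for d in range(dim)
    ]
```

With `beta` greater than 1, the KL term is penalized more strongly than in a plain VAE, which is the usual mechanism by which beta-VAE encourages disentangled latent factors. The fusion function simply shows how attention weights can trade off the contributions of different views.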