Dual-view multi-modal contrastive learning for graph-based recommender systems
Feipeng Guo, Zifan Wang, Xiaopeng Wang, Qibei Lu, Shaobo Ji
DOI: https://doi.org/10.1016/j.compeleceng.2024.109213
IF: 4.152
2024-03-31
Computers & Electrical Engineering
Abstract: Personalized recommender systems play a crucial role in various online content-sharing platforms (e.g., TikTok). Learning representations of multi-modal content is pivotal in current graph-based recommender systems. Existing works aim to enhance recommendation accuracy by leveraging multi-modal features (e.g., image, sound, text) as side information for items. However, this approach falls short of fully discerning users' fine-grained preferences across different modalities. To tackle this limitation, this paper introduces the Dual-view Multi-Modal contrastive learning Recommendation model (DMM-Rec). DMM-Rec employs self-supervised learning to guide the learning of user and item representations within the multi-modal context. Specifically, to capture users' preferences for different modalities, we propose specific-modal contrastive learning. Simultaneously, to capture users' cross-modal preferences, cross-modal contrastive learning is introduced to uncover interdependencies in users' preferences across modalities. The contrastive learning tasks not only adaptively explore potential relations between modalities but also address the data sparsity challenge in recommender systems. Extensive experiments on three datasets against ten baselines demonstrate that DMM-Rec outperforms the strongest baseline by an average of 6.81%. These results underscore the effectiveness of considering multi-modal content in improving recommender systems.
engineering, electrical & electronic; computer science, interdisciplinary applications, hardware & architecture
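
The abstract names two self-supervised objectives, specific-modal and cross-modal contrastive learning, without giving their formulas. The following is a minimal sketch of how such a pair of losses could look, assuming a standard InfoNCE objective with in-batch negatives. It is not the authors' implementation: the function names (`info_nce`, `dual_view_losses`), the temperature value, the use of two perturbed views per modality, and the modality keys are all hypothetical choices made for illustration.

```python
from itertools import combinations

import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def info_nce(anchor, positive, temperature=0.2):
    """InfoNCE loss: row i of `anchor` should match row i of `positive`.

    All other rows of `positive` in the batch serve as negatives.
    (Assumed objective; the abstract does not state the exact loss.)
    """
    a = l2_normalize(anchor)
    p = l2_normalize(positive)
    logits = a @ p.T / temperature
    # Subtract the row maximum for numerical stability before exponentiating.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def dual_view_losses(views_a, views_b, temperature=0.2):
    """Return (specific_modal_loss, cross_modal_loss).

    views_a, views_b: dicts mapping a modality name to an (n_users, dim)
    array holding two perturbed views of the same users' embeddings.
    Hypothetical setup: the specific-modal term aligns the two views
    within each modality; the cross-modal term aligns the same user's
    embeddings across every pair of modalities.
    """
    modalities = sorted(views_a)
    specific = sum(
        info_nce(views_a[m], views_b[m], temperature) for m in modalities
    )
    cross = sum(
        info_nce(views_a[m], views_a[n], temperature)
        for m, n in combinations(modalities, 2)
    )
    return specific, cross


# Usage with random stand-in embeddings (illustrative data only).
rng = np.random.default_rng(0)
base = {m: rng.normal(size=(8, 16)) for m in ("image", "sound", "text")}
noisy = {m: v + 0.1 * rng.normal(size=v.shape) for m, v in base.items()}
specific_loss, cross_loss = dual_view_losses(base, noisy)
```

In a full model these two terms would typically be added, with weighting coefficients, to the main recommendation loss; how DMM-Rec combines them is not stated in the abstract.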