Multi-modal visual adversarial Bayesian personalized ranking model for recommendation
Guangli Li, Jianwu Zhuo, Chuanxiu Li, Jin Hua, Tian Yuan, Zhengyu Niu, Donghong Ji, Renzhong Wu, Hongbin Zhang
DOI: https://doi.org/10.1016/j.ins.2021.05.022
IF: 8.1
2021-01-01
Information Sciences
Abstract: Recommender systems face the "data sparseness" issue. Additional information, including images, texts, and videos, helps alleviate it. We propose a new multi-modal visual adversarial Bayesian personalized ranking (MVABPR) model to address this issue. The proposed model takes new features, cross-modal semantics, adversarial learning, and a visual interface into account. Two multi-modal datasets are created from the MovieLens datasets and their correlated images. In addition to shape, texture, color, and deep learning-based features, a set of efficient match kernel features is proposed. More discriminative but low-dimensional cross-modal semantics are mined from these features to characterize each item effectively, and are absorbed into the MVABPR model through a visual interface. A new adversarial learning strategy is employed to optimize the whole training procedure, which makes the MVABPR model more robust and stable. Experimental results demonstrate that the MVABPR model is effective and robust for recommendation and that it outperforms other competitive baselines. As another advantage, it can effectively learn visual information and users' ratings jointly, in combination with adversarial learning. The implicit feeling tone of a recommended item can also be accurately captured by the proposed model. More importantly, the model achieves better performance on a large-scale, sparser dataset, demonstrating its higher practicality. (c) 2021 Elsevier Inc. All rights reserved.
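
The abstract does not give the MVABPR equations, so the following is only a minimal sketch of the general idea it names: a Bayesian personalized ranking loss whose score includes a visual term, trained together with an adversarially perturbed copy of that loss. Everything here is an assumption made for illustration and not the authors' formulation: the additive form of `score`, the choice to perturb the item visual vectors, the per-item L2-normalized perturbation of size `eps`, the weight `lam`, and all function and parameter names.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def score(user_vec, item_vec, user_visual, item_visual):
    """Assumed preference score: collaborative term plus visual term."""
    return dot(user_vec, item_vec) + dot(user_visual, item_visual)


def bpr_loss(x_pos, x_neg):
    """Pairwise BPR loss -ln(sigmoid(x_pos - x_neg)), computed stably."""
    diff = x_pos - x_neg
    # -ln(sigmoid(d)) = ln(1 + exp(-d)); the split below avoids overflow.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


def adversarial_bpr_loss(user_vec, user_visual, pos, neg, eps=0.5, lam=1.0):
    """Return (clean loss, adversarial loss, combined loss).

    pos and neg are (item_vec, item_visual) pairs for an item the user
    interacted with and one the user did not (hypothetical layout).
    """
    clean = bpr_loss(score(user_vec, pos[0], user_visual, pos[1]),
                     score(user_vec, neg[0], user_visual, neg[1]))

    # The loss gradient w.r.t. the positive item's visual vector points
    # along -user_visual, and along +user_visual for the negative item.
    # Step each visual vector by eps in its loss-increasing direction.
    norm = math.sqrt(dot(user_visual, user_visual)) or 1.0
    step = [eps * u / norm for u in user_visual]
    pos_adv = [v - s for v, s in zip(pos[1], step)]
    neg_adv = [v + s for v, s in zip(neg[1], step)]

    adv = bpr_loss(score(user_vec, pos[0], user_visual, pos_adv),
                   score(user_vec, neg[0], user_visual, neg_adv))
    # Combined objective: clean loss plus weighted adversarial loss.
    return clean, adv, clean + lam * adv
```

Under these assumptions the perturbation lowers the score margin between the positive and negative item by `2 * eps * ||user_visual||`, so the adversarial loss is never smaller than the clean one. Minimizing the combined loss therefore asks the model to rank correctly even when the visual features are shifted in the worst-case direction, which is one common reading of "more robust and stable" training.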