Cross-Modal Content Inference and Feature Enrichment for Cold-Start Recommendation

Haokai Ma, Zhuang Qi, Xinxin Dong, Xiangxian Li, Yuze Zheng, Xiangxu Meng and Lei Meng
2023-07-06
Abstract: Multimedia recommendation aims to fuse the multi-modal information of items for feature enrichment to improve recommendation performance. However, existing methods typically introduce multi-modal information on top of collaborative information to improve overall recommendation precision, while failing to explore cold-start recommendation performance. Moreover, these methods are only applicable when such multi-modal data is available. To address this problem, this paper proposes a recommendation framework, named Cross-modal Content Inference and Feature Enrichment Recommendation (CIERec), which exploits multi-modal information to improve cold-start recommendation performance. Specifically, CIERec first introduces image annotation as privileged information to help guide the mapping of unified features from the visual space to the semantic space in the training phase. CIERec then enriches the content representation by fusing collaborative, visual, and cross-modal inferred representations, so as to improve its cold-start recommendation performance. Experimental results on two real-world datasets show that the content representations learned by CIERec achieve superior cold-start recommendation performance over existing visually-aware recommendation algorithms. More importantly, CIERec consistently achieves significant improvements with different conventional visually-aware backbones, which verifies its universality and effectiveness.
Information Retrieval
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper addresses the cold-start problem in recommendation systems, particularly in multimedia recommendation. Existing methods typically use multi-modal information to improve overall recommendation accuracy but do not fully explore cold-start recommendation performance. Moreover, these methods are only applicable when multi-modal data is available. To address these issues, the authors propose a new framework called **CIERec (Cross-modal Content Inference and Feature Enrichment Recommendation)**.

#### Main Objectives

1. **Utilize Cross-modal Inference**: Introduce image annotations as privileged information to guide the unified feature mapping from the visual space to the semantic space during the training phase.
2. **Integrate Multimodal Information**: Combine collaborative, visual, and cross-modal inferred representations to enrich the content representation, thereby improving cold-start recommendation performance.
3. **Improve Stability and Accuracy**: Use cross-modal inference to mitigate the problem of missing heterogeneous modalities in cold-start recommendation, thus improving the stability and accuracy of existing visually-aware cold-start recommendation models.

### Summary

This paper proposes a new framework, CIERec, which improves cold-start recommendation performance through cross-modal inference and multi-modal information integration. Because the semantic representation is inferred from visual features, the model does not require image annotations to be available outside of training. Experimental results on two real-world datasets show that CIERec outperforms existing visually-aware recommendation algorithms.
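
The two steps described above (cross-modal inference guided by annotations during training, then fusion of collaborative, visual, and inferred representations) can be illustrated with a minimal sketch. This is not the authors' implementation: the single sigmoid layer used for inference, the binary cross-entropy loss, fusion by concatenation, the dot-product scorer, and every name and dimension below are assumptions chosen only to make the data flow concrete.

```python
import math
import random

random.seed(0)

# Hypothetical dimensions, chosen only for illustration.
VISUAL_DIM, SEMANTIC_DIM, COLLAB_DIM = 8, 4, 4


def random_matrix(rows, cols):
    """Small random weight matrix standing in for learned parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Assumed mapping from the visual space to the semantic (annotation) space.
W_infer = random_matrix(SEMANTIC_DIM, VISUAL_DIM)


def infer_semantic(visual):
    """Cross-modal inference: predict annotation-like scores from visual features."""
    return [sigmoid(z) for z in matvec(W_infer, visual)]


def annotation_loss(predicted, annotation):
    """Binary cross-entropy against the image annotation.

    The annotation is privileged information: it appears only in this
    training-time loss and is never an input to infer_semantic().
    """
    eps = 1e-12
    total = sum(
        a * math.log(p + eps) + (1 - a) * math.log(1 - p + eps)
        for p, a in zip(predicted, annotation)
    )
    return -total / len(annotation)


def enrich(collaborative, visual):
    """Fuse collaborative, visual, and inferred semantic parts by concatenation
    (an assumed fusion operator)."""
    return collaborative + visual + infer_semantic(visual)


def score(user, item_repr):
    """Dot-product preference score (an assumed scoring function)."""
    return sum(u * i for u, i in zip(user, item_repr))


# Usage: a cold-start item has no interaction history, so its collaborative
# part is zeroed here; the visual and inferred semantic parts still carry signal.
visual = [random.random() for _ in range(VISUAL_DIM)]
annotation = [1, 0, 0, 1]  # available during training only
cold_item_collab = [0.0] * COLLAB_DIM

train_loss = annotation_loss(infer_semantic(visual), annotation)
item_repr = enrich(cold_item_collab, visual)
user = [random.uniform(-1, 1) for _ in range(len(item_repr))]
print(len(item_repr), train_loss, score(user, item_repr))
```

Note that `enrich` takes no annotation argument: in this sketch the annotation shapes `W_infer` only through the training loss, which is what allows the enriched representation to be built for items that arrive without annotations.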