Collaborative Cross-modal Fusion with Large Language Model for Recommendation

Zhongzhou Liu, Hao Zhang, Kuicai Dong, Yuan Fang

DOI: https://doi.org/10.1145/3627673.3679596

2024-08-16

Abstract: Despite the success of conventional collaborative filtering (CF) approaches for recommendation systems, they exhibit limitations in leveraging semantic knowledge within the textual attributes of users and items. Recent focus on the application of large language models for recommendation (LLM4Rec) has highlighted their capability for effective semantic knowledge capture. However, these methods often overlook the collaborative signals in user behaviors. Some simply instruct-tune a language model, while others directly inject the embeddings of a CF-based model, lacking a synergistic fusion of different modalities. To address these issues, we propose a framework of Collaborative Cross-modal Fusion with Large Language Models, termed CCF-LLM, for recommendation. In this framework, we translate the user-item interactions into a hybrid prompt to encode both semantic knowledge and collaborative signals, and then employ an attentive cross-modal fusion strategy to effectively fuse latent embeddings of both modalities. Extensive experiments demonstrate that CCF-LLM outperforms existing methods by effectively utilizing semantic and collaborative signals in the LLM4Rec context.

Information Retrieval, Computation and Language

What problem does this paper attempt to address?

The paper targets two shortcomings of existing recommendation approaches:

1. **Limitations of traditional collaborative filtering (CF)**: Conventional CF models are effective at learning from user-item interactions, but they make little use of the semantic knowledge contained in the textual attributes of users and items.

2. **Limitations of applying large language models (LLMs) to recommendation**: LLMs capture semantic knowledge well, but semantic relevance alone is not enough to model user preferences, and existing LLM4Rec methods often overlook the collaborative signals in user behavior. Some simply instruction-tune a language model on user-item interactions expressed as natural language, while others directly inject embeddings from a CF-based model. Neither approach achieves a synergistic fusion of semantic knowledge and collaborative signals.

To address these issues, the authors propose a framework called **Collaborative Cross-modal Fusion with Large Language Models (CCF-LLM)**. It translates user-item interactions into a hybrid prompt that encodes both semantic knowledge and collaborative signals, then applies an attentive cross-modal fusion strategy to fuse the latent embeddings of the two modalities. The authors report that CCF-LLM outperforms existing methods on recommendation tasks.
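As an illustration of what an attentive fusion of two modalities can look like, the sketch below blends a semantic embedding and a collaborative embedding using softmax attention weights. It is a generic, minimal example and not the authors' implementation: the function name `attentive_fusion`, the `query` vector, and the scaled dot-product scoring are all assumptions introduced here, and the summary above does not specify the scoring function CCF-LLM actually uses.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fusion(semantic, collaborative, query):
    """Blend two same-length embeddings with attention weights.

    Hypothetical helper: `query` stands in for whatever learned vector
    scores the two modalities; it is an assumption, not from the paper.
    Returns the fused embedding and the two attention weights.
    """
    if not (len(semantic) == len(collaborative) == len(query)):
        raise ValueError("all vectors must have the same length")

    # Scaled dot-product score for each modality against the query.
    scale = math.sqrt(len(query))
    scores = [
        sum(q * x for q, x in zip(query, emb)) / scale
        for emb in (semantic, collaborative)
    ]
    weights = softmax(scores)

    # Weighted sum of the two embeddings, element by element.
    fused = [
        weights[0] * s + weights[1] * c
        for s, c in zip(semantic, collaborative)
    ]
    return fused, weights
```

With a zero `query` both modalities score equally and the result is a plain average; a query aligned with one embedding shifts weight toward that modality. In a trained model the scoring parameters would be learned jointly with the rest of the network rather than supplied by hand.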

Large Language Models Enhanced Collaborative Filtering

Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Collaborative Large Language Model for Recommender Systems

Large Language Models meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System

CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation

Collaborative Knowledge Fusion: A Novel Approach for Multi-task Recommender Systems via LLMs

CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation

Integrating Large Language Models into Recommendation via Mutual Augmentation and Adaptive Aggregation

NoteLLM-2: Multimodal Large Representation Models for Recommendation

Triple Modality Fusion: Aligning Visual, Textual, and Graph Data with Large Language Models for Multi-Behavior Recommendations

Representation Learning with Large Language Models for Recommendation

Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

Personalized Recommendation Systems Powered By Large Language Models: Integrating Semantic Understanding and User Preferences

Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

LLM-ESR: Large Language Models Enhancement for Long-tailed Sequential Recommendation

A Large Language Model Enhanced Conversational Recommender System

Text-like Encoding of Collaborative Information in Large Language Models for Recommendation

Large Language Model with Graph Convolution for Recommendation