CoLLM: Integrating Collaborative Embeddings into Large Language Models for Recommendation

Yang Zhang, Fuli Feng, Jizhi Zhang, Keqin Bao, Qifan Wang, Xiangnan He
2024-10-24
Abstract: Leveraging Large Language Models as Recommenders (LLMRec) has gained significant attention and introduced fresh perspectives in user preference modeling. Existing LLMRec approaches prioritize text semantics, usually neglecting the valuable collaborative information from user-item interactions in recommendations. While these text-emphasizing approaches excel in cold-start scenarios, they may yield sub-optimal performance in warm-start situations. In pursuit of superior recommendations for both cold and warm start scenarios, we introduce CoLLM, an innovative LLMRec methodology that seamlessly incorporates collaborative information into LLMs for recommendation. CoLLM captures collaborative information through an external traditional model and maps it to the input token embedding space of LLM, forming collaborative embeddings for LLM usage. Through this external integration of collaborative information, CoLLM ensures effective modeling of collaborative information without modifying the LLM itself, providing the flexibility to employ various collaborative information modeling techniques. Extensive experiments validate that CoLLM adeptly integrates collaborative information into LLMs, resulting in enhanced recommendation performance. We release the code and data at https://github.com/zyang1580/CoLLM.
Information Retrieval
What problem does this paper attempt to address?
The paper addresses how to integrate collaborative information into large language models (LLMs) used as recommenders, so that recommendation quality holds up in both cold-start and warm-start scenarios. Existing LLMRec methods represent users and items with text tokens and rely mainly on text semantics, so they cannot capture the collaborative information contained in user-item interactions. For example, two items with similar text descriptions may carry different collaborative information if they are consumed by different users, yet that difference is easily overlooked because the texts look alike. Collaborative information is particularly beneficial when interaction data is rich (i.e., warm-start), and ignoring it can leave LLMRec methods behind traditional recommendation models in that setting. The paper therefore poses the research question of how to efficiently integrate collaborative information into LLMs so that they perform well for both cold-start and warm-start users and items. Its answer is CoLLM, which captures collaborative information with an external traditional model and maps it into the LLM's input token embedding space, improving recommendation performance without modifying the LLM itself.
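The mapping idea described above can be illustrated with a small sketch. It is not the authors' implementation: the single linear mapping, the placeholder names `<UserID>` and `<ItemID>`, the toy dimensions, and the random values are all assumptions made for illustration. It only shows the general shape of the technique: embeddings from an external collaborative model are projected to the LLM's token embedding size and placed into the prompt's embedding sequence, while the LLM's own token table is left untouched.

```python
import random


def linear_map(vec, weight, bias):
    """Project a collaborative embedding (length d_cf) to the LLM token size (length d_llm)."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]


def build_prompt_embeddings(prompt_tokens, token_table, collab_vectors, weight, bias):
    """Return one embedding per prompt token.

    Ordinary tokens are looked up in the (frozen) LLM token table.
    Placeholder tokens are replaced by mapped collaborative embeddings.
    """
    embedded = []
    for tok in prompt_tokens:
        if tok in collab_vectors:  # hypothetical placeholder such as "<UserID>"
            embedded.append(linear_map(collab_vectors[tok], weight, bias))
        else:
            embedded.append(token_table[tok])
    return embedded


random.seed(0)
D_CF, D_LLM = 4, 8  # toy sizes, assumed for illustration

# Toy stand-in for the LLM's frozen token embedding table.
words = ["Would", "user", "enjoy", "item", "?"]
token_table = {w: [random.gauss(0, 1) for _ in range(D_LLM)] for w in words}

# Toy stand-in for embeddings produced by an external collaborative model
# (for example, matrix factorization trained on user-item interactions).
collab_vectors = {
    "<UserID>": [random.gauss(0, 1) for _ in range(D_CF)],
    "<ItemID>": [random.gauss(0, 1) for _ in range(D_CF)],
}

# Mapping parameters; in practice these would be learned, here they are random.
weight = [[random.gauss(0, 0.1) for _ in range(D_CF)] for _ in range(D_LLM)]
bias = [0.0] * D_LLM

prompt = ["Would", "user", "<UserID>", "enjoy", "item", "<ItemID>", "?"]
embedded = build_prompt_embeddings(prompt, token_table, collab_vectors, weight, bias)
print(len(embedded), len(embedded[2]))
```

Because only `weight` and `bias` would need training in this sketch, the external collaborative model could be swapped for another one without touching the LLM, which matches the flexibility the abstract describes.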