Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Bowen Zheng, Yupeng Hou, Hongyu Lu, Yu Chen, Wayne Xin Zhao, Ming Chen, Ji-Rong Wen
DOI: https://doi.org/10.48550/arXiv.2311.09049
2024-04-19
Abstract: Recently, large language models (LLMs) have shown great potential in recommender systems, either improving existing recommendation models or serving as the backbone. However, there exists a large semantic gap between LLMs and recommender systems, since items to be recommended are often indexed by discrete identifiers (item ID) out of the LLM's vocabulary. In essence, LLMs capture language semantics while recommender systems imply collaborative semantics, making it difficult to sufficiently leverage the model capacity of LLMs for recommendation. To address this challenge, in this paper, we propose a new LLM-based recommendation model called LC-Rec, which can better integrate language and collaborative semantics for recommender systems. Our approach can directly generate items from the entire item set for recommendation, without relying on candidate items. Specifically, we make two major contributions in our approach. For item indexing, we design a learning-based vector quantization method with uniform semantic mapping, which can assign meaningful and non-conflicting IDs (called item indices) for items. For alignment tuning, we propose a series of specially designed tuning tasks to enhance the integration of collaborative semantics in LLMs. Our fine-tuning tasks enforce LLMs to deeply integrate language and collaborative semantics (characterized by the learned item indices), so as to achieve an effective adaptation to recommender systems. Extensive experiments demonstrate the effectiveness of our method, showing that our approach can outperform a number of competitive baselines including traditional recommenders and existing LLM-based recommenders. Our code is available at https://github.com/RUCAIBox/LC-Rec/.
Information Retrieval
What problem does this paper attempt to address?
This paper addresses how to use large language models (LLMs) effectively in recommender systems. The obstacle is a large semantic gap: LLMs capture language semantics, whereas recommender systems rely on collaborative semantics. Items are usually indexed by discrete identifiers (item IDs) that lie outside the LLM's vocabulary, so it is difficult to bring the full model capacity of LLMs to bear on recommendation tasks. To meet this challenge, the paper proposes LC-Rec, a new LLM-based recommendation model that aims to better integrate language and collaborative semantics.

The main contributions of LC-Rec lie in two aspects:

1. **Item Indexing**: A learning-based vector quantization method with uniform semantic mapping assigns meaningful and non-conflicting IDs (called item indices) to items. Each item is represented by a small number of discrete indices that capture the intrinsic similarity between items.
2. **Alignment Fine-Tuning**: A series of specially designed fine-tuning tasks strengthens the integration of language and collaborative semantics in LLMs. These tasks include sequential item prediction, explicit index-language alignment, and implicit recommendation-oriented alignment, so that the LLM integrates both kinds of semantics and adapts to the requirements of recommender systems.

With these two components, LC-Rec generates recommended items directly from the entire item set, without relying on a candidate set. Experimental results show that LC-Rec performs strongly on multiple benchmark datasets, with an average performance improvement of 25.5%.
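
To make the item indexing idea more concrete, here is a minimal sketch of how an item embedding could be turned into a short tuple of discrete indices and then rendered as tokens for an LLM. It rests on several assumptions that the text above does not confirm: it uses plain residual quantization with small hand-written codebooks rather than a learned quantizer, it does not implement uniform semantic mapping (so collisions between items are possible in general), and the `<a_1><b_0>` token format, the item names, and all function names are hypothetical.

```python
import math


def nearest_code(vector, codebook):
    """Return the position of the codebook entry closest to `vector`
    under squared Euclidean distance."""
    best_index, best_dist = 0, math.inf
    for i, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def quantize_item(embedding, codebooks):
    """Map one item embedding to a tuple of indices, one per codebook level.

    Each level quantizes what the previous levels left unexplained
    (the residual), so early indices are coarse and later ones refine.
    """
    residual = list(embedding)
    indices = []
    for codebook in codebooks:
        k = nearest_code(residual, codebook)
        indices.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return tuple(indices)


def index_tokens(indices):
    """Render indices as hypothetical special tokens, one prefix letter per level."""
    return "".join(f"<{chr(ord('a') + level)}_{k}>" for level, k in enumerate(indices))


# Hypothetical codebooks, written by hand for illustration only.
# A learning-based method would fit these to the item embeddings.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],   # level 0: coarse clusters
    [[0.0, 0.0], [0.2, -0.2]],  # level 1: refinement of the residual
]

# Hypothetical 2-dimensional item embeddings.
items = {
    "item_A": [0.1, 0.0],
    "item_B": [1.2, 0.8],
    "item_C": [0.9, 1.0],
}

item_index = {name: quantize_item(vec, codebooks) for name, vec in items.items()}

for name, indices in item_index.items():
    print(name, indices, index_tokens(indices))
```

In this toy setup, `item_B` and `item_C` share the same first-level index because their embeddings are close, and differ only at the second level, which is the sense in which a few discrete indices can reflect similarity between items. Tokens of this kind would have to be added to the LLM's vocabulary before fine-tuning, since they are not part of it by default.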