Content-Based Collaborative Generation for Recommender Systems

Yidan Wang, Zhaochun Ren, Weiwei Sun, Jiyuan Yang, Zhixiang Liang, Xin Chen, Ruobing Xie, Su Yan, Xu Zhang, Pengjie Ren, Zhumin Chen, Xin Xin
DOI: https://doi.org/10.1145/3627673.3679692
2024-11-12
Abstract: Generative models have emerged as a promising utility to enhance recommender systems. It is essential to model both item content and user-item collaborative interactions in a unified generative framework for better recommendation. Although some existing large language model (LLM)-based methods contribute to fusing content information and collaborative signals, they fundamentally rely on textual language generation, which is not fully aligned with the recommendation task. How to integrate content knowledge and collaborative interaction signals in a generative framework tailored for item recommendation is still an open research challenge. In this paper, we propose content-based collaborative generation for recommender systems, namely ColaRec. ColaRec is a sequence-to-sequence framework which is tailored for directly generating the recommended item identifier. Precisely, the input sequence comprises data pertaining to the user's interacted items, and the output sequence represents the generative identifier (GID) for the suggested item. To model collaborative signals, the GIDs are constructed from a pretrained collaborative filtering model, and the user is represented as the content aggregation of interacted items. To this end, ColaRec captures both collaborative signals and content information in a unified framework. Then an item indexing task is proposed to conduct the alignment between the content-based semantic space and the interaction-based collaborative space. Besides, a contrastive loss is further introduced to ensure that items with similar collaborative GIDs have similar content representations. To verify the effectiveness of ColaRec, we conduct experiments on four benchmark datasets. Empirical results demonstrate the superior performance of ColaRec.
Information Retrieval
What problem does this paper attempt to address?
This paper addresses how to integrate content knowledge and user-item collaborative signals in a single generative recommender so that recommendation performance improves. Existing methods based on large language models (LLMs) can fuse content information with collaborative signals, but they rely on textual language generation, which is not fully aligned with the recommendation task. Specifically, these methods perform poorly when they must generate a target item ID from a large candidate pool, and mapping the generated language back to specific items requires a non-trivial grounding stage. Modeling collaborative signals and content information effectively within one framework also remains a challenge. To address these issues, the paper proposes ColaRec, a content-based collaborative generation framework for recommender systems. ColaRec is a sequence-to-sequence model designed to directly generate the generative identifier (GID) of the recommended item. The GIDs are constructed from a pretrained collaborative filtering model so that they carry collaborative signals, and each user is represented as an aggregation of the content of their historically interacted items, so both kinds of information are captured in a unified framework. The paper also introduces an auxiliary item indexing task that aligns the content-based semantic space with the interaction-based collaborative space, and a contrastive loss that ensures items with similar collaborative GIDs also have similar content representations.
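The two ingredients described above (GIDs derived from collaborative filtering, and users represented by the content of their interacted items) can be illustrated with a small sketch. The summary does not say how the GIDs are built from the pretrained model, so everything here is an assumption made for illustration: item vectors are taken to be embeddings from some collaborative filtering model, they are clustered recursively with a plain k-means, and the path of cluster indices serves as the GID. The names `kmeans`, `build_gids`, and `user_input` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on lists of floats; returns one label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def build_gids(item_vecs, k=2, prefix=()):
    """Assumed scheme: recursively cluster CF embeddings; GID = cluster path + leaf index."""
    items = list(item_vecs)
    if len(items) <= k:
        return {item: prefix + (i,) for i, item in enumerate(items)}
    labels = kmeans([item_vecs[i] for i in items], k)
    groups = defaultdict(list)
    for item, lab in zip(items, labels):
        groups[lab].append(item)
    if len(groups) == 1:
        # Degenerate split (e.g. identical vectors): enumerate the items instead.
        return {item: prefix + (i,) for i, item in enumerate(items)}
    gids = {}
    for lab, members in groups.items():
        sub = {m: item_vecs[m] for m in members}
        gids.update(build_gids(sub, k, prefix + (lab,)))
    return gids


def user_input(history, item_content):
    """Assumed aggregation: concatenate the content of the user's interacted items."""
    return " ".join(item_content[i] for i in history)


# Toy data: "a"/"b" and "c"/"d" are close to each other in the CF embedding space.
item_vecs = {"a": [0.0, 0.0], "b": [0.0, 0.1], "c": [5.0, 5.0], "d": [5.0, 5.1]}
item_content = {"a": "sci-fi novel", "b": "space opera", "c": "cookbook", "d": "baking guide"}
gids = build_gids(item_vecs, k=2)
source_text = user_input(["a", "b"], item_content)
```

In this toy setup, items that are close in the collaborative space share a GID prefix, which is the property that lets a sequence-to-sequence model decode a GID token by token from the aggregated content in `source_text`. The item indexing task and the contrastive loss mentioned above are not covered by this sketch.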