Preference Discerning with LLM-Enhanced Generative Retrieval

Fabian Paischer, Liu Yang, Linfeng Liu, Shuai Shao, Kaveh Hassani, Jiacheng Li, Ricky Chen, Zhang Gabriel Li, Xialo Gao, Wei Shao, Xue Feng, Nima Noorshams, Sem Park, Bo Long, Hamid Eghbalzadeh
2024-12-12
Abstract: Sequential recommendation systems aim to provide personalized recommendations for users based on their interaction history. To achieve this, they often incorporate auxiliary information, such as textual descriptions of items and auxiliary tasks, like predicting user preferences and intent. Despite numerous efforts to enhance these models, they still suffer from limited personalization. To address this issue, we propose a new paradigm, which we term preference discerning. In preference discerning, we explicitly condition a generative sequential recommendation system on user preferences within its context. To this end, we generate user preferences using Large Language Models (LLMs) based on user reviews and item-specific data. To evaluate preference discerning capabilities of sequential recommendation systems, we introduce a novel benchmark that provides a holistic evaluation across various scenarios, including preference steering and sentiment following. We assess current state-of-the-art methods using our benchmark and show that they struggle to accurately discern user preferences. Therefore, we propose a new method named Mender ($\textbf{M}$ultimodal Prefer$\textbf{en}$ce $\textbf{d}$iscern$\textbf{er}$), which improves upon existing methods and achieves state-of-the-art performance on our benchmark. Our results show that Mender can be effectively guided by human preferences even though they have not been observed during training, paving the way toward more personalized sequential recommendation systems. We will open-source the code and benchmarks upon publication.
Information Retrieval, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
The paper addresses how sequential recommendation systems can use user preferences more effectively to personalize recommendations. Existing systems try to improve personalization by incorporating auxiliary information (such as item descriptions) and auxiliary tasks, but personalization remains limited. Users' choices are driven by their preferences, yet commonly used recommendation datasets do not state these preferences explicitly, so they must be inferred from users' interaction histories. Current generative methods also cannot dynamically adjust their recommendations to a user's context, which leads to poor generalization to new users, and there are no established evaluation criteria for measuring how well a model discerns user preferences. To address these problems, the paper proposes a new paradigm named "preference discerning", whose core idea is to explicitly condition a generative sequential recommendation system on user preferences. To this end, the paper uses large language models (LLMs) to generate user preferences from user reviews and item-specific data, and proposes Mender (Multimodal Preference Discerner), a new multimodal generative retrieval method that improves on existing methods. Mender can be guided by preferences expressed in natural language even when those preferences were not observed during training, paving the way for more personalized sequential recommendation systems.
In addition, the paper introduces a new benchmark for comprehensively evaluating the preference-discerning ability of sequential recommendation systems along five evaluation dimensions: preference recommendation, sentiment following, fine-grained navigation, coarse-grained navigation, and history integration. Using this benchmark, the paper evaluates current state-of-the-art generative retrieval methods and shows that they struggle with preference discerning, performing especially poorly on new-user sequences. Mender outperforms existing generative retrieval models in all evaluation dimensions, especially when dealing with synthetic user sequences.
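To make the idea of conditioning on preferences "within the context" more concrete, the following is a minimal sketch of how natural-language preferences and a recent interaction history might be combined into a single conditioning string for a generative recommender. It is an illustration only, not the paper's implementation: the `UserContext` container, the `build_conditioning_text` helper, the `[PREF]` and `[ITEM]` markers, and the `max_history` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UserContext:
    """Hypothetical container; field names are illustrative, not from the paper."""
    preferences: List[str]  # natural-language preferences, e.g. LLM-generated from reviews
    history: List[str]      # textual item descriptions, oldest first


def build_conditioning_text(ctx: UserContext, max_history: int = 5) -> str:
    """Join preferences and the most recent items into one conditioning string.

    The [PREF] and [ITEM] markers are assumptions made for this sketch.
    """
    # Guard against max_history <= 0, since history[-0:] would return the full list.
    recent = ctx.history[-max_history:] if max_history > 0 else []
    pref_part = " ".join(f"[PREF] {p}" for p in ctx.preferences)
    hist_part = " ".join(f"[ITEM] {h}" for h in recent)
    return f"{pref_part} {hist_part}".strip()


if __name__ == "__main__":
    ctx = UserContext(
        preferences=["prefers fragrance-free products"],
        history=["shampoo", "face cream", "sunscreen"],
    )
    print(build_conditioning_text(ctx, max_history=2))
```

In a setup like this, swapping the preference strings at inference time changes the conditioning text without retraining, which is the kind of steering the benchmark is described as testing. How Mender actually encodes preferences and items is not specified in the text above.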