LlamaRec: Two-Stage Recommendation using Large Language Models for Ranking

Zhenrui Yue, Sara Rabhi, Gabriel de Souza Pereira Moreira, Dong Wang, Even Oldridge
2023-10-25
Abstract: Recently, large language models (LLMs) have exhibited significant progress in language understanding and generation. By leveraging textual features, customized LLMs are also applied for recommendation and demonstrate improvements across diverse recommendation scenarios. Yet the majority of existing methods perform training-free recommendation that heavily relies on pretrained knowledge (e.g., movie recommendation). In addition, inference on LLMs is slow due to autoregressive generation, rendering existing methods less effective for real-time recommendation. As such, we propose a two-stage framework using large language models for ranking-based recommendation (LlamaRec). In particular, we use small-scale sequential recommenders to retrieve candidates based on the user interaction history. Then, both history and retrieved items are fed to the LLM in text via a carefully designed prompt template. Instead of generating next-item titles, we adopt a verbalizer-based approach that transforms output logits into probability distributions over the candidate items. Therefore, the proposed LlamaRec can efficiently rank items without generating long text. To validate the effectiveness of the proposed framework, we compare against state-of-the-art baseline methods on benchmark datasets. Our experimental results show that LlamaRec consistently achieves superior results in both recommendation performance and efficiency.
Information Retrieval, Artificial Intelligence, Computation and Language
What problem does this paper attempt to address?
The paper addresses the efficiency problems that large language models (LLMs) face in recommendation systems, particularly in real-time scenarios. Existing LLM-based methods mostly perform training-free recommendation that relies on pretrained knowledge, and their inference is slow because of autoregressive generation, which limits their usefulness in practice. To this end, the authors propose LlamaRec, a framework that adopts a two-stage recommendation strategy:

1. **Retrieval Stage**: Use a small-scale sequential recommender (e.g., LRURec) to quickly retrieve candidate items from the user interaction history.
2. **Ranking Stage**: Convert the user history and the retrieved candidates into text through a carefully designed prompt template, and rank the candidates with an LLM (specifically the Llama 2 model).

Unlike generation-based methods that produce the next item's title, LlamaRec employs a verbalizer-based approach that converts output logits into a probability distribution over the candidate items, thereby avoiding the high latency of long text generation.

Experimental results show that LlamaRec consistently outperforms existing methods on multiple benchmark datasets, improving both recommendation performance and inference efficiency. LlamaRec also compares favorably with LLM-based baseline methods, particularly on the Beauty dataset. Finally, a comparison of inference times across different average title lengths indicates that LlamaRec is more efficient than generation-based methods.
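The retrieve-then-rank flow and the verbalizer step can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each candidate is tagged with an index letter and that the verbalizer reads the next-token logits of those letter tokens; the labeling scheme, the prompt wording, and the function names `retrieve_top_k`, `build_prompt`, and `rank_with_verbalizer` are all hypothetical. The retriever scores and the LLM logits are toy values standing in for real model outputs.

```python
import math
from string import ascii_uppercase


def retrieve_top_k(retriever_scores, k):
    """Stage 1: keep the k highest-scoring items from a small sequential recommender."""
    ranked = sorted(retriever_scores.items(), key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in ranked[:k]]


def build_prompt(history, candidates):
    """Render history and lettered candidates as text (hypothetical template)."""
    labels = ascii_uppercase[:len(candidates)]
    lines = [f"({label}) {title}" for label, title in zip(labels, candidates)]
    return "\n".join([
        "User history: " + "; ".join(history),
        "Candidates:",
        *lines,
        "Answer with the letter of the best next item:",
    ])


def rank_with_verbalizer(next_token_logits, candidates):
    """Stage 2: softmax over the label-token logits only, then sort candidates."""
    if len(candidates) > len(ascii_uppercase):
        raise ValueError("this sketch supports at most 26 candidates")
    labels = ascii_uppercase[:len(candidates)]
    # Only the logits of the label tokens are used; all other tokens are ignored.
    scores = [next_token_logits[label] for label in labels]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - max_score) for score in scores]
    total = sum(exps)
    probs = [value / total for value in exps]
    return sorted(zip(candidates, probs), key=lambda pair: pair[1], reverse=True)


# Toy stand-ins (assumptions): scores from a small retriever and the logits
# an LLM might return for the next token after a single forward pass.
retriever_scores = {"Lip Balm": 0.9, "Face Wash": 0.8, "Hand Cream": 0.7, "Shampoo": 0.1}
history = ["Sunscreen", "Moisturizer"]

candidates = retrieve_top_k(retriever_scores, k=3)
prompt = build_prompt(history, candidates)
toy_logits = {"A": 0.2, "B": 2.1, "C": 1.0, "the": 3.5}
ranking = rank_with_verbalizer(toy_logits, candidates)

for title, prob in ranking:
    print(f"{title}: {prob:.3f}")
```

Because the ranking comes from one set of next-token logits, the cost in this sketch does not grow with the length of the item titles that a generation-based method would have to decode token by token.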