LLM-Powered User Simulator for Recommender System

Zijian Zhang, Shuchang Liu, Ziru Liu, Rui Zhong, Qingpeng Cai, Xiangyu Zhao, Chunxu Zhang, Qidong Liu, Peng Jiang
2024-12-22
Abstract: User simulators can rapidly generate a large volume of timely user behavior data, providing a testing platform for reinforcement learning-based recommender systems, thus accelerating their iteration and optimization. However, prevalent user simulators generally suffer from significant limitations, including the opacity of user preference modeling and the incapability of evaluating simulation accuracy. In this paper, we introduce an LLM-powered user simulator to simulate user engagement with items in an explicit manner, thereby enhancing the efficiency and effectiveness of reinforcement learning-based recommender systems training. Specifically, we identify the explicit logic of user preferences, leverage LLMs to analyze item characteristics and distill user sentiments, and design a logical model to imitate real human engagement. By integrating a statistical model, we further enhance the reliability of the simulation, proposing an ensemble model that synergizes logical and statistical insights for user interaction simulations. Capitalizing on the extensive knowledge and semantic generation capabilities of LLMs, our user simulator faithfully emulates user behaviors and preferences, yielding high-fidelity training data that enrich the training of recommendation algorithms. We establish quantifying and qualifying experiments on five datasets to validate the simulator's effectiveness and stability across various recommendation scenarios.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
### What problems does this paper attempt to solve?

This paper aims to solve two main problems of user simulators in reinforcement learning (RL) recommendation systems:

1. **Opaque user preference modeling**: Existing user simulators do not model user preferences explicitly, which makes it difficult to predict user choices accurately. For example, some simulators use generative adversarial networks (GANs) or offline-trained Transformers to simulate the user interaction distribution, but these methods lack transparency.
2. **Lack of an effective evaluation framework**: There is currently no effective framework for measuring the fidelity of simulated interactions to real user behavior, which makes it difficult to verify the effectiveness and reliability of user simulators.

To solve these problems, the authors introduce a user simulator based on large language models (LLMs) that simulates user interactions with the recommendation system more efficiently and effectively. Specifically, the main objectives of this study are:

- **Clarify user interaction logic**: Analyze item features and user sentiment, and use LLMs to infer the potential reasons why users like or dislike items.
- **Construct a comprehensive model**: Combine the advantages of logical reasoning and statistical learning in an ensemble model that improves the reliability and efficiency of user interaction simulation.
- **Provide comprehensive evaluation**: Conduct experiments on multiple public datasets to verify the effectiveness and stability of the user simulator in different recommendation scenarios.

Through these improvements, this study aims to accelerate the iteration and optimization of RL recommendation systems and to provide high-quality training data, thereby improving the performance of recommendation algorithms.

### Key innovation points

1. **Explicit user preference modeling**: Analyze item features and user reviews with LLMs, extract keywords, and construct a logical model that represents user preferences explicitly.
2. **Reduce computational cost and the risk of hallucination**: Use LLMs to generate only concise key reasons, and regularize the result with statistical models, which lowers both computational cost and the risk of hallucination.
3. **Multidimensional evaluation framework**: Conduct experiments on datasets from five different domains to validate the generality and reliability of the user simulator.

### Formula summary

- **Keyword matching model**:

  \[ \alpha_{\text{pos}} = \sum_{i \in I_{\text{pos}}} \left| D_{i_c}^{\text{pos}} \cap D_i^{\text{pos}} \right| \]

  \[ \alpha_{\text{neg}} = \sum_{i \in I_{\text{neg}}} \left| D_{i_c}^{\text{neg}} \cap D_i^{\text{neg}} \right| \]

  \[ f_{\text{mat}}(I_{\text{pos}}, I_{\text{neg}}, i_c) = \begin{cases} 1 & \text{if } \alpha_{\text{pos}} > \alpha_{\text{neg}} \\ \text{rand}\{0,1\} & \text{if } \alpha_{\text{pos}} = \alpha_{\text{neg}} \\ 0 & \text{if } \alpha_{\text{pos}} < \alpha_{\text{neg}} \end{cases} \]

- **Similarity calculation model**:

  \[ E_{\text{pos}} = \text{AvePool}(\{\text{BERT}(d) \mid d \in D_{\text{pos}}\}) \]

  \[ E_{\text{neg}} = \text{AvePool}(\{\text{BERT}(d) \mid d \in D_{\text{neg}}\}) \]
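
The keyword matching model above can be illustrated with a small Python sketch. It is only a minimal reading of the three keyword matching formulas, not the authors' implementation: the function names `alpha` and `f_mat`, the dictionary layout for per-item keyword sets, and the sample keywords are all assumptions made for illustration. Each item is assumed to carry a set of positive keywords and a set of negative keywords (which the paper obtains from LLM analysis), and the candidate item is compared against the user's liked and disliked history.

```python
import random
from typing import Dict, Iterable, Optional, Set

# Hypothetical layout (not from the paper):
# item id -> {"pos": keywords users like, "neg": keywords users dislike}
ItemKeywords = Dict[str, Dict[str, Set[str]]]


def alpha(items: Iterable[str], candidate: str,
          keywords: ItemKeywords, polarity: str) -> int:
    """Sum over historical items of the keyword overlap with the candidate."""
    cand = keywords[candidate][polarity]
    return sum(len(cand & keywords[i][polarity]) for i in items)


def f_mat(pos_items: Iterable[str], neg_items: Iterable[str], candidate: str,
          keywords: ItemKeywords, rng: Optional[random.Random] = None) -> int:
    """Return 1 (like) or 0 (dislike); ties are broken uniformly at random."""
    rng = rng or random.Random()
    a_pos = alpha(pos_items, candidate, keywords, "pos")
    a_neg = alpha(neg_items, candidate, keywords, "neg")
    if a_pos > a_neg:
        return 1
    if a_pos < a_neg:
        return 0
    return rng.randint(0, 1)


# Made-up sample data for illustration only.
keywords: ItemKeywords = {
    "m1": {"pos": {"witty", "fast-paced"}, "neg": {"predictable"}},
    "m2": {"pos": {"witty", "heartfelt"}, "neg": {"slow"}},
    "m3": {"pos": {"epic"}, "neg": {"slow", "predictable"}},
    "cand": {"pos": {"witty", "heartfelt"}, "neg": {"slow"}},
}

# The user liked m1 and m2 and disliked m3:
# alpha_pos = 1 + 2 = 3, alpha_neg = 1, so the candidate is predicted as liked.
print(f_mat(["m1", "m2"], ["m3"], "cand", keywords))  # prints 1
```

Passing a seeded `random.Random` as `rng` makes the tie-breaking branch reproducible, which can be convenient when comparing simulator runs.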