SUBER: An RL Environment with Simulated Human Behavior for Recommender Systems

Nathan Corecco, Giorgio Piatti, Luca A. Lanzendörfer, Flint Xiaofeng Fan, Roger Wattenhofer

2024-08-20

Abstract: Reinforcement learning (RL) has gained popularity in the realm of recommender systems due to its ability to optimize long-term rewards and guide users in discovering relevant content. However, the successful implementation of RL in recommender systems is challenging because of several factors, including the limited availability of online data for training on-policy methods. This scarcity requires expensive human interaction for online model training. Furthermore, the development of effective evaluation frameworks that accurately reflect the quality of models remains a fundamental challenge in recommender systems. To address these challenges, we propose a comprehensive framework for synthetic environments that simulate human behavior by harnessing the capabilities of large language models (LLMs). We complement our framework with in-depth ablation studies and demonstrate its effectiveness with experiments on movie and book recommendations. Using LLMs as synthetic users, this work introduces a modular and novel framework to train RL-based recommender systems. The software, including the RL environment, is publicly available on GitHub.

Information Retrieval, Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to address several major challenges faced when applying Reinforcement Learning (RL) in recommendation systems:

1. **Data Availability**: RL algorithms require a large amount of interaction data with the environment to learn effective strategies. However, in practical applications of recommendation systems, if users receive random or irrelevant recommendations, they may quickly abandon the service. This makes it impractical to collect a large amount of training data without compromising user experience.
2. **Unknown User Model**: In RL, the reward function is crucial for the model's learning. In the context of recommendation systems, designing a synthetic reward function that accurately reflects user satisfaction or preferences is challenging because modeling human behavior is highly complex.
3. **Model Evaluation**: A key challenge in recommendation systems is evaluating model performance without directly interacting with real users, thereby avoiding any potential negative impact on user experience. On the other hand, evaluations based on offline data cannot guarantee recommendation performance in the real world.

To address these challenges, the authors propose a new framework called "Simulated User Behavior for Recommender Systems" (SUBER). SUBER is a synthetic environment framework that uses large language models (LLMs) to simulate human behavior. By using LLMs to simulate user behavior, SUBER can not only generate synthetic data but also leverage the capabilities of LLMs to replicate individual behaviors with unknown patterns. Additionally, this dynamic environment can serve as a model evaluation tool for recommendation systems.
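The idea of an environment whose reward comes from a simulated user can be illustrated with a minimal sketch. This is not the SUBER implementation or its API: the names `User`, `Item`, `SimulatedUserEnv`, `stub_llm_rating`, and `evaluate` are hypothetical, and the LLM is replaced by a deterministic stub so the example runs without any model or external library. It assumes a Gym-style `reset`/`step` loop in which the action is an item index and the reward is the simulated user's rating.

```python
import random
from dataclasses import dataclass


@dataclass
class User:
    name: str
    liked_genres: set


@dataclass
class Item:
    title: str
    genre: str


def stub_llm_rating(user: User, item: Item, times_seen: int) -> int:
    """Stand-in for an LLM call; returns a rating from 1 to 5.

    Assumption: a real setup would prompt a language model with the user
    description and the item, then parse a rating from its reply. The
    rule below is purely illustrative.
    """
    base = 5 if item.genre in user.liked_genres else 2
    return max(1, base - times_seen)  # repeated items become less rewarding


class SimulatedUserEnv:
    """Toy environment with a Gym-style reset/step interface (hypothetical)."""

    def __init__(self, users, items, rater=stub_llm_rating, horizon=5, seed=0):
        self.users, self.items, self.rater = users, items, rater
        self.horizon = horizon
        self.rng = random.Random(seed)

    def reset(self):
        self.user = self.rng.choice(self.users)
        self.history = []  # (item index, rating) pairs for the current user
        return self._observation()

    def step(self, action: int):
        item = self.items[action]
        times_seen = sum(1 for idx, _ in self.history if idx == action)
        reward = self.rater(self.user, item, times_seen)
        self.history.append((action, reward))
        done = len(self.history) >= self.horizon
        return self._observation(), reward, done

    def _observation(self):
        return {"user": self.user.name, "history": list(self.history)}


def evaluate(env, policy, episodes=20):
    """Average per-step simulated rating obtained by a policy."""
    total, steps = 0, 0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
            steps += 1
    return total / steps
```

The same loop covers two of the challenges listed above: stepping through the environment generates synthetic interaction data for training, and `evaluate` scores a candidate policy without exposing real users to it. Swapping `stub_llm_rating` for a function that queries an actual model would be the only change needed in this sketch.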
