Reinforcement Learning-based Recommender Systems with Large Language Models for State Reward and Action Modeling

Jie Wang, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose

2024-03-26

Abstract: Reinforcement Learning (RL)-based recommender systems have demonstrated promising performance in meeting user expectations by learning to make accurate next-item recommendations from historical user-item interactions. However, existing offline RL-based sequential recommendation methods face the challenge of obtaining effective user feedback from the environment. Effectively modeling the user state and shaping an appropriate reward for recommendation remains a challenge. In this paper, we leverage language understanding capabilities and adapt large language models (LLMs) as an environment (LE) to enhance RL-based recommenders. The LE is learned from a subset of user-item interaction data, thus reducing the need for large training data, and can synthesise user feedback for offline data by: (i) acting as a state model that produces high quality states that enrich the user representation, and (ii) functioning as a reward model to accurately capture nuanced user preferences on actions. Moreover, the LE can generate positive actions that augment the limited offline training data. We propose an LE Augmentation (LEA) method to further improve recommendation performance by optimising jointly the supervised component and the RL policy, using the augmented actions and historical user signals. We use LEA, the state and reward models in conjunction with state-of-the-art RL recommenders and report experimental results on two publicly available datasets.

Information Retrieval

What problem does this paper attempt to address?

The paper addresses key issues that arise when Reinforcement Learning (RL) is used for sequential recommendation. Specifically, it targets four problems:

1. **Effective User Feedback Acquisition**: Existing offline RL methods struggle to obtain effective user feedback from the environment, especially when modeling user states and shaping appropriate rewards.
2. **Reducing Training Data Requirements**: The paper adapts large language models (LLMs) as an environment (LE) for RL recommenders. The LE is learned from a subset of user-item interaction data, which reduces the need for large amounts of training data.
3. **Generating High-Quality User States and Rewards**: The LE serves as a state model that produces high-quality states enriching the user representation, and as a reward model that captures nuanced user preferences on actions.
4. **Augmenting Limited Offline Data**: The LE can generate positive actions that augment the limited offline training data. The proposed LE Augmentation (LEA) method then jointly optimizes the supervised component and the RL policy, using these augmented actions and historical user signals.

In summary, this research aims to improve the performance of RL-based recommender systems by integrating the capabilities of large language models, particularly for sequential recommendation in offline settings.
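The joint optimisation of a supervised component and an RL policy over logged and augmented actions can be illustrated with a small sketch. The abstract does not state the form of either loss, so everything below is an assumption made for illustration: the supervised term is taken to be a cross-entropy over next-item logits, the RL term a squared one-step temporal-difference error, and the names `cross_entropy`, `td_error_sq`, `joint_loss`, and `reward_fn` are hypothetical. For simplicity the sketch also reuses the same next-state Q-values for every action, which a real environment model would not do.

```python
import math

def cross_entropy(logits, target):
    """Supervised term (assumed): negative log-likelihood of the logged next item."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def td_error_sq(q_values, next_q_values, action, reward, gamma=0.5):
    """RL term (assumed): squared one-step temporal-difference error for one action."""
    target = reward + gamma * max(next_q_values)
    return (q_values[action] - target) ** 2

def joint_loss(logits, q_values, next_q_values, logged_action,
               augmented_actions, reward_fn, gamma=0.5):
    """Add the supervised term to TD terms over the logged and augmented actions.

    reward_fn is a hypothetical stand-in for a learned reward model:
    it maps an action (item index) to a scalar reward.
    """
    loss = cross_entropy(logits, logged_action)
    for action in [logged_action, *augmented_actions]:
        loss += td_error_sq(q_values, next_q_values, action,
                            reward_fn(action), gamma)
    return loss

# Toy usage: three candidate items, item 0 was logged, item 1 is an augmented positive.
example_loss = joint_loss(
    logits=[2.0, 0.5, -1.0],
    q_values=[1.0, 0.2, 0.0],
    next_q_values=[0.5, 0.1, 0.0],
    logged_action=0,
    augmented_actions=[1],
    reward_fn=lambda a: 1.0 if a in (0, 1) else 0.0,
)
```

In this toy setup the augmented action contributes an extra TD term with a positive reward, which is one plausible way that generated positive actions could enlarge the training signal beyond the logged interaction.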

Representation Learning with Large Language Models for Recommendation

LLaRA: Aligning Large Language Models with Sequential Recommenders

RLRF4Rec: Reinforcement Learning from Recsys Feedback for Enhanced Recommendation Reranking

Large Language Models Make Sample-Efficient Recommender Systems

Large Language Models are Learnable Planners for Long-Term Recommendation

Contrastive State Augmentations for Reinforcement Learning-Based Recommender Systems

Large Language Model Can Interpret Latent Space of Sequential Recommender

Exploring the Impact of Large Language Models on Recommender Systems: An Extensive Review

Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation

LLaRA: Large Language-Recommendation Assistant

Improving Sequential Recommendations with LLMs

Leveraging Large Language Models for Pre-trained Recommender Systems

Model-enhanced Contrastive Reinforcement Learning for Sequential Recommendation

Large Language Models for Recommendation: Past, Present, and Future

Enhancing Recommender Systems with Large Language Model Reasoning Graphs

Recommender Systems in the Era of Large Language Models (LLMs)

ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation

A survey on large language models for recommendation

CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation