Abstract: Recent advancements have showcased the potential of Large Language Models (LLMs) in executing reasoning tasks, particularly facilitated by Chain-of-Thought (CoT) prompting. While tasks like arithmetic reasoning involve clear, definitive answers and logical chains of thought, the application of LLM reasoning in recommendation systems (RecSys) presents a distinct challenge. RecSys tasks revolve around subjectivity and personalized preferences, an under-explored domain in utilizing LLMs' reasoning capabilities. Our study explores several aspects to better understand reasoning for RecSys and demonstrate how task quality improves by utilizing LLM reasoning in both zero-shot and finetuning settings. Additionally, we propose RecSAVER (Recommender Systems Automatic Verification and Evaluation of Reasoning) to automatically assess the quality of LLM reasoning responses without the requirement of curated gold references or human raters. We show that our framework aligns with real human judgment on the coherence and faithfulness of reasoning responses. Overall, our work shows that incorporating reasoning into RecSys can improve personalized tasks, paving the way for further advancements in recommender system methodologies.

What problem does this paper attempt to address?

The paper explores how to leverage the reasoning capabilities of large language models (LLMs) to improve personalized recommender systems (RecSys). Specifically, the research covers the following aspects:

1. **Research Background and Motivation**: Although LLMs have shown great potential on logical reasoning tasks, especially with Chain-of-Thought (CoT) prompting, their application to recommender systems, where tasks hinge on subjectivity and personalized preferences, remains underexplored.
2. **Research Objectives**: The paper aims to use the reasoning capabilities of LLMs to improve personalized tasks in recommender systems. It also proposes Rec-SAVER, an automatic method for evaluating reasoning quality that requires neither curated gold-standard references nor human raters.
3. **Methodology**:
   - **Reasoning in Zero-Shot Learning**: Zero-shot CoT prompting guides the LLM to generate a reasoning response and then make a prediction based on user history and item metadata.
   - **Fine-Tuning with Reasoning**: Reasoning outputs generated by larger language models are collected as training data to fine-tune smaller pre-trained models and improve their task performance.
4. **Evaluation Method**: The Rec-SAVER framework automatically evaluates the quality of reasoning responses generated by LLMs. It first generates a set of post hoc explanations based on user history and recommended items, then uses a self-verification step to keep only the high-quality explanations. These serve as reference standards against which the reasoning responses of other LLMs are scored.
5. **Experimental Results**: Experiments on the Amazon product review dataset show that introducing reasoning steps significantly improves recommender system task performance in both the zero-shot and the fine-tuning-with-reasoning settings. The paper also compares different model sizes and analyzes the effects of different filtering methods.

In summary, the main contribution of the paper is to demonstrate how the reasoning capabilities of LLMs can improve personalized recommendation performance, and to propose a new method for automatically evaluating reasoning quality.
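The Rec-SAVER flow described under the evaluation method (generate explanations, self-verify, score other responses against the survivors) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. The `LLM` callable type, the prompt wording, the function names, and the token-overlap F1 metric are all assumptions made for the example. The self-verification rule used here (keep an explanation only if a model that is shown it can recover the known rating) is one plausible reading of the summary.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical interface: takes a prompt string, returns the model's text.
LLM = Callable[[str], str]


def build_references(history: str, item: str, rating: int,
                     generate: LLM, verify: LLM, n: int = 3) -> List[str]:
    """Generate n explanations given the known rating; keep those that self-verify."""
    kept = []
    for _ in range(n):
        explanation = generate(
            f"User history:\n{history}\nItem: {item}\n"
            f"The user rated this item {rating}/5. Explain why."
        )
        # Assumed check: the explanation should let a model recover the rating.
        guess = verify(
            f"User history:\n{history}\nItem: {item}\n"
            f"Explanation: {explanation}\n"
            "Predict the rating (1-5) as a single digit."
        )
        if guess.strip() == str(rating):
            kept.append(explanation)
    return kept


def overlap_f1(candidate: str, reference: str) -> float:
    """Token-overlap F1, a simple stand-in for a text-similarity metric."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    common = sum((Counter(cand) & Counter(ref)).values())
    if common == 0:
        return 0.0
    precision, recall = common / len(cand), common / len(ref)
    return 2 * precision * recall / (precision + recall)


def score_reasoning(candidate: str, references: List[str]) -> Optional[float]:
    """Best overlap against any verified reference; None if none survived."""
    if not references:
        return None
    return max(overlap_f1(candidate, r) for r in references)
```

In practice `generate` and `verify` would wrap calls to an actual model, and the overlap score would likely be replaced by a stronger similarity metric. Returning `None` when no reference survives verification keeps such examples from being silently scored as zero.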

Leveraging LLM Reasoning Enhances Personalized Recommender Systems

ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning

Leveraging Large Language Models for Pre-trained Recommender Systems

Enhancing Recommender Systems with Large Language Model Reasoning Graphs

LLMRG: Improving Recommendations through Large Language Model Reasoning Graphs

LLM Reasoners: New Evaluation, Library, and Analysis of Step-by-Step Reasoning with Large Language Models

When Do Program-of-Thoughts Work for Reasoning?

DRDT: Dynamic Reflection with Divergent Thinking for LLM-based Sequential Recommendation

An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration

RDRec: Rationale Distillation for LLM-based Recommendation

CoRAL: Collaborative Retrieval-Augmented Large Language Models Improve Long-tail Recommendation

Recommender Systems in the Era of Large Language Models (LLMs)

LLM-Powered Explanations: Unraveling Recommendations Through Subgraph Reasoning

Improving LLM Reasoning through Scaling Inference Computation with Collaborative Verification

Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation

All Roads Lead to Rome: Unveiling the Trajectory of Recommender Systems Across the LLM Era

LLMs for Relational Reasoning: How Far are We?

Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning

Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search