Finetuning Large Language Model for Personalized Ranking

Zhuoxi Bai, Ning Wu, Fengyu Cai, Xinyi Zhu, Yun Xiong

2024-06-20

Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across various domains, motivating researchers to investigate their potential use in recommendation systems. However, directly applying LLMs to recommendation tasks has proven challenging due to the significant disparity between the data used for pre-training LLMs and the specific requirements of recommendation tasks. In this study, we introduce Direct Multi-Preference Optimization (DMPO), a streamlined framework designed to bridge the gap and enhance the alignment of LLMs for recommendation tasks. DMPO enhances the performance of LLM-based recommenders by simultaneously maximizing the probability of positive samples and minimizing the probability of multiple negative samples. We conducted experimental evaluations to compare DMPO against traditional recommendation methods and other LLM-based recommendation approaches. The results demonstrate that DMPO significantly improves the recommendation capabilities of LLMs across three real-world public datasets in few-shot scenarios. Additionally, the experiments indicate that DMPO exhibits superior generalization ability in cross-domain recommendations. A case study elucidates the reasons behind these consistent improvements and also underscores DMPO's potential as an explainable recommendation system.

Information Retrieval

What problem does this paper attempt to address?

The paper addresses how to apply Large Language Models (LLMs) effectively to recommendation systems. The core obstacle is the mismatch between the data used to pre-train LLMs and the requirements of recommendation tasks, which limits LLM performance when the models are applied directly. The paper proposes Direct Multi-Preference Optimization (DMPO), a framework that narrows this gap and better aligns LLMs with recommendation tasks. DMPO improves LLM-based recommenders by simultaneously maximizing the probability of positive samples and minimizing the probability of multiple negative samples. In experiments on three real-world public datasets, DMPO is compared against traditional recommendation methods and against Supervised Fine-Tuning (SFT) methods that rely solely on positive samples; it significantly improves the recommendation ability of LLMs in few-shot scenarios and generalizes better in cross-domain recommendation. A case study explains the reasons for these improvements and highlights DMPO's potential as an explainable recommendation system. The authors also provide code and data for further research. In summary, the paper aims to make LLMs effective for personalized recommendation, improving both the accuracy and the generalization of recommendation systems.
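
The summary describes the objective only in words: raise the probability of the positive sample while lowering the probability of several negatives at once. The sketch below shows one plausible way such an objective could look, assuming a DPO-style loss in which the negatives' log-ratio rewards are averaged before being compared with the positive's. The function names, the `beta` default, and the averaging choice are illustrative assumptions, not the authors' released implementation.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def multi_negative_preference_loss(
    pos_logp: float,
    pos_ref_logp: float,
    neg_logps: list[float],
    neg_ref_logps: list[float],
    beta: float = 0.1,  # assumed temperature; not taken from the paper
) -> float:
    """Hypothetical DPO-style loss with one positive and k negatives.

    Each *_logp is the log-probability of a full response (summed over
    tokens) under the tuned policy; each *_ref_logp is the same quantity
    under a frozen reference model.
    """
    if not neg_logps or len(neg_logps) != len(neg_ref_logps):
        raise ValueError("need at least one negative, with matching reference scores")

    # Implicit reward of the positive sample: scaled log-ratio to the reference.
    pos_reward = beta * (pos_logp - pos_ref_logp)

    # Assumption: the k negatives contribute through their mean implicit reward.
    neg_rewards = [beta * (lp - ref) for lp, ref in zip(neg_logps, neg_ref_logps)]
    mean_neg_reward = sum(neg_rewards) / len(neg_rewards)

    # Minimizing this pushes the positive up and the negatives down together.
    return -log_sigmoid(pos_reward - mean_neg_reward)


if __name__ == "__main__":
    # Policy identical to the reference: the margin is zero.
    print(multi_negative_preference_loss(-5.0, -5.0, [-6.0, -7.0], [-6.0, -7.0]))
    # Policy prefers the positive and disfavors both negatives: lower loss.
    print(multi_negative_preference_loss(-4.0, -5.0, [-7.0, -8.0], [-6.0, -7.0]))
```

With a single negative this reduces to the standard pairwise DPO loss. By contrast, an SFT objective that relies solely on positive samples would use only the `pos_logp` term.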