Abstract: In online video platforms, reading or writing comments on interesting videos has become an essential part of the video watching experience. However, existing video recommender systems mainly model users' interaction behaviors with videos, lacking consideration of comments in user behavior modeling. In this paper, we propose a novel recommendation approach called LSVCR by leveraging user interaction histories with both videos and comments, so as to jointly conduct personalized video and comment recommendation. Specifically, our approach consists of two key components, namely a sequential recommendation (SR) model and a supplemental large language model (LLM) recommender. The SR model serves as the primary recommendation backbone (retained in deployment) of our approach, allowing for efficient user preference modeling. Meanwhile, we leverage the LLM recommender as a supplemental component (discarded in deployment) to better capture underlying user preferences from heterogeneous interaction behaviors. In order to integrate the merits of the SR model and the supplemental LLM recommender, we design a two-stage training paradigm. The first stage is personalized preference alignment, which aims to align the preference representations from both components, thereby enhancing the semantics of the SR model. The second stage is recommendation-oriented fine-tuning, in which the alignment-enhanced SR model is fine-tuned according to specific objectives. Extensive experiments in both video and comment recommendation tasks demonstrate the effectiveness of LSVCR. Additionally, online A/B testing on the KuaiShou platform verifies the actual benefits brought by our approach. In particular, we achieve a significant overall gain of 4.13% in comment watch time.

What problem does this paper attempt to address?

The paper addresses a gap in recommendation on online video platforms: existing video recommender systems mainly model users' interactions with videos and ignore the role of comments in user behavior modeling. As online video communities grow, comments become increasingly important, because they provide supplementary information and enhance the viewing experience. The research therefore aims to improve recommendation quality and user engagement by integrating video and comment data.

Specifically, the paper proposes a recommendation method named LSVCR (Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation). The method uses users' historical interactions with both videos and comments to jointly perform personalized video and comment recommendation. LSVCR contains two key components:

1. **Sequential Recommendation model (SR model)**: The primary recommendation backbone. It is retained in deployment and models user preferences efficiently.
2. **Supplemental Large Language Model recommender (LLM recommender)**: Used during training to capture underlying user preferences from heterogeneous interaction behaviors, and discarded in deployment.

To integrate the advantages of the two components, the paper designs a two-stage training paradigm:

- **Stage 1: Personalized Preference Alignment**: Aligns the preference representations from the two components, thereby enhancing the semantics of the SR model.
- **Stage 2: Recommendation-Oriented Fine-tuning**: Fine-tunes the alignment-enhanced SR model according to specific objectives to improve recommendation performance.
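As a rough illustration of what aligning the preference representations from the two components can look like, the sketch below computes a symmetric InfoNCE-style contrastive loss between a batch of SR-model user representations and the matching LLM-recommender representations. This is a minimal sketch under assumptions, not the paper's implementation: the choice of InfoNCE with in-batch negatives, cosine similarity, the temperature value, and the names `info_nce` and `symmetric_alignment_loss` are all illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] is expected to match positives[i]; every other row of
    `positives` serves as an in-batch negative (an assumption of this sketch).
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def symmetric_alignment_loss(sr_reps, llm_reps, temperature=0.1):
    """Average of the SR-to-LLM and LLM-to-SR contrastive losses."""
    forward = info_nce(sr_reps, llm_reps, temperature)
    backward = info_nce(llm_reps, sr_reps, temperature)
    return 0.5 * (forward + backward)


if __name__ == "__main__":
    sr = [[1.0, 0.0], [0.0, 1.0]]        # hypothetical SR-model user vectors
    llm_ok = [[2.0, 0.0], [0.0, 3.0]]    # same directions: well aligned
    llm_bad = [[0.0, 3.0], [2.0, 0.0]]   # swapped rows: misaligned
    print(symmetric_alignment_loss(sr, llm_ok))
    print(symmetric_alignment_loss(sr, llm_bad))
```

The loss is small when each user's SR vector points in the same direction as that user's LLM vector and large when the pairs are mismatched. In a two-stage setup, this kind of loss would drive Stage 1, after which only the SR model is fine-tuned and deployed.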
The experimental results show that LSVCR is effective in both video and comment recommendation tasks, and the online A/B test verifies its benefit in an industrial recommender system. In particular, for comments, LSVCR achieves a 4.13% gain in comment watch time and a 1.36% increase in the number of interactions.

### Formula Summary

- **Text Feature Embedding**:
\[ z_{v_i} = [\text{LLM}(t_i) \,\|\, \text{LLM}(c_i)]\, W_1, \]
\[ z_{c_j} = [\text{LLM}(t_j) \,\|\, \text{MEAN}(\text{LLM}(c^1_j), \ldots, \text{LLM}(c^k_j))]\, W_1. \]
- **Sequence Representation Learning**:
\[ H_v = \text{Transformer}_v(E_v + \widetilde{P}_v), \]
\[ H_c = \text{Transformer}_c(E_c + \widetilde{P}_c). \]
- **Preference Extraction**:
\[ s_v = F_v(\widehat{H}_v) = \sum_{i=1}^{n} \alpha_i h^v_i, \quad \alpha_i = \frac{\exp(f(h^v_i))}{\sum_{k=1}^{n} \exp(f(h^v_k))}, \]
\[ s_c = F_c(\widehat{H}_c) = \sum_{j=1}^{m} \beta_j h^c_j, \quad \beta_j = \frac{\exp(g(e_{t_{m+1}}, h^c_j))}{\sum_{k=1}^{m} \exp(g(e_{t_{m+1}}, h^c_k))}. \]
- **Contrastive Loss Function**: \[ L_{SSC}=\frac{1}{2}(\text{InfoN}
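The Preference Extraction formulas above are softmax-weighted sums of hidden states: each position gets a score, the scores are normalized with a softmax, and the hidden states are averaged with those weights. The sketch below shows that pooling step in plain Python. The scoring functions are assumptions for illustration only: the source does not define \(f\) and \(g\), so a dot product with a hypothetical weight vector stands in for \(f\), and a dot product with the target embedding stands in for \(g\).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_pool(hidden_states, score_fn):
    """Return (pooled vector, weights) for a softmax-weighted sum.

    hidden_states: list of equal-length vectors (h_1 ... h_n).
    score_fn: maps one hidden state to a scalar score (f or g in the formulas).
    """
    weights = softmax([score_fn(h) for h in hidden_states])
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    video_states = [[1.0, 0.0], [0.0, 1.0]]

    # Video side: score depends only on the hidden state (stand-in for f).
    w_f = [0.5, -0.5]  # hypothetical learned weight vector
    s_v, alpha = attention_pool(video_states, lambda h: dot(w_f, h))

    # Comment side: score also depends on the target embedding (stand-in for g).
    e_target = [5.0, 0.0]  # hypothetical embedding of the target video
    s_c, beta = attention_pool(video_states, lambda h: dot(e_target, h))
    print(s_v, alpha)
    print(s_c, beta)
```

The only difference between the two branches in this sketch is the scoring function. The video branch scores each hidden state on its own, while the comment branch conditions the score on the target embedding, so positions similar to the target receive more weight.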

A Large Language Model Enhanced Sequential Recommender for Joint Video and Comment Recommendation
