Abstract: Real-world recommendation systems commonly offer diverse content scenarios for users to interact with. Considering the enormous number of users in industrial platforms, it is infeasible to utilize a single unified recommendation model to meet the requirements of all scenarios. Usually, separate recommendation pipelines are established for each distinct scenario. This practice leads to challenges in comprehensively grasping users' interests. Recent research endeavors have been made to tackle this problem by pre-training models to encapsulate the overall interests of users. Traditional pre-trained recommendation models mainly capture user interests by leveraging collaborative signals. Nevertheless, a prevalent drawback of these systems is their incapacity to handle long-tail items and cold-start scenarios. With the recent advent of large language models, there has been a significant increase in research efforts focused on exploiting LLMs to extract semantic information for users and items. However, text-based recommendations highly rely on elaborate feature engineering and frequently fail to capture collaborative similarities. To overcome these limitations, we propose a novel pre-training framework for sequential recommendation, termed PRECISE. This framework combines collaborative signals with semantic information. Moreover, PRECISE employs a learning framework that initially models users' comprehensive interests across all recommendation scenarios and subsequently concentrates on the specific interests of target-scene behaviors. We demonstrate that PRECISE precisely captures the entire range of user interests and effectively transfers them to the target interests. Empirical findings reveal that the PRECISE framework attains outstanding performance on both public and industrial datasets.

What problem does this paper attempt to address?

The problem this paper addresses is how, in industrial-scale recommendation systems, to capture users' comprehensive interests across multiple recommendation scenarios and apply them to recommendation tasks in a specific scenario. Specifically, the paper targets the following problems:

1. **Challenges in multi-scenario user-interest modeling**:
   - Industrial recommendation systems usually build an independent pipeline for each recommendation scenario, which makes it difficult to grasp users' interests as a whole.
   - Traditional pre-trained models rely mainly on collaborative signals and perform poorly on long-tail items and in cold-start scenarios.
2. **Challenges in combining collaborative signals and semantic information**:
   - Existing methods based on large language models (LLMs) can extract semantic information, but they depend heavily on elaborate feature engineering and often fail to capture collaborative similarity.

To overcome these problems, the paper proposes a new pre-training framework named PRECISE, which combines collaborative signals with semantic information in order to capture users' interests more comprehensively and transfer them effectively to tasks in specific scenarios.

### Main contributions

- **Proposing the PRECISE framework**: Users' comprehensive interests across all scenarios are captured through universal training, and these interests are then transferred to tasks in specific scenarios through targeted training.
- **Fusing ID embedding and semantic embedding**: Large language models generate semantic embeddings of items, which are combined with ID embeddings to form the item representation; a Mixture of Experts (MoE) structure balances the importance of the different embeddings.
- **Practical training and deployment experiences**: Extensive experiments on offline datasets and in online settings verify the scalability and effectiveness of the PRECISE framework.
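The fusion named in the second contribution can be illustrated with a small sketch that mirrors Formulas 3 and 4 of the formula summary: a top-k softmax gate over experts, followed by combining the gated mixture with the ID embedding. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the `⊕` operator is read as concatenation, each attention expert `Attn_j` is replaced by a plain linear map as a stand-in, and all function and parameter names (`topk_softmax`, `fuse_item_embedding`, `W_gate`, `experts`) are hypothetical.

```python
import math


def matvec(x, W):
    """Row vector x (length d) times matrix W (d x m), returning a list of length m."""
    return [sum(x[r] * W[r][c] for r in range(len(x))) for c in range(len(W[0]))]


def topk_softmax(logits, k):
    """Softmax restricted to the k largest logits; every other entry gets weight 0."""
    keep = sorted(range(len(logits)), key=lambda j: logits[j], reverse=True)[:k]
    m = max(logits[j] for j in keep)  # subtract the max for numerical stability
    exps = {j: math.exp(logits[j] - m) for j in keep}
    z = sum(exps.values())
    return [exps[j] / z if j in exps else 0.0 for j in range(len(logits))]


def fuse_item_embedding(id_emb, x, W_gate, experts, k):
    """Return (e_i, gate), where e_i = ID(i) concatenated with sum_j gate_j * expert_j(x).

    Assumptions: concatenation for the combine step, and linear experts
    (one weight matrix each) standing in for the attention experts.
    """
    gate = topk_softmax(matvec(x, W_gate), k)
    mixed = [0.0] * len(experts[0][0])
    for g, W_e in zip(gate, experts):
        if g == 0.0:
            continue  # expert was not selected by the top-k gate
        y = matvec(x, W_e)
        mixed = [acc + g * v for acc, v in zip(mixed, y)]
    return id_emb + mixed, gate
```

With three experts and `k = 2`, only the two experts with the largest gate logits contribute to the semantic part of the embedding, and their weights sum to 1.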
### Formula summary

- **Formula 1**: Optimization objective for predicting the next interaction item
  \[ \arg\max_{\hat{i} \in I} P(i_{u,n+1} = \hat{i} \mid S_u) \]
- **Formula 2**: Generation of semantic embeddings
  \[ x_i = \text{LLM}(T_i) \]
- **Formula 3**: Gating network calculation of the MoE module
  \[ \text{gate}(x_i) = \text{softmax}(\text{topk}(x_i \cdot W_{\text{gate}})) \]
- **Formula 4**: Calculation of the final item embedding
  \[ e_i = \text{ID}(i) \oplus \sum_{j=1}^{K} \left( \text{gate}(x_i)_j \cdot \text{Attn}_j(x_i) \right) \]
- **Formula 5**: Next-item prediction loss function
  \[ L_{\text{nip}} = -\sum_{u \in U,\, i \in S_u} \log\left( \frac{\exp(h^H_{u,i} \cdot e_{i+1})}{\exp(h^H_{u,i} \cdot e_{i+1}) + \sum_{j \in N_u,\, j \neq i+1} \exp(h^H_{u,i} \cdot e_j)} \right) \]
- **Formula 6**: Bayesian Personalized Ranking loss function
  \[ L_{\text{bpr}} = -\sum_{u \in U} \sum_{u' \in U,\, u' \neq u} \log\left( \sigma(h'_u \cdot e_{u,n+1} - h'_{u'} \cdot e_{u,n+1}) \right) \]

Through the above methods, the PRECISE framework can more comprehensively capture users' interests.
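To make the two loss functions (Formulas 5 and 6) concrete, the following pure-Python sketch evaluates them on plain lists of floats. It is an illustration under assumptions rather than the paper's code: `next_item_loss` handles a single sequence position (the outer sum over users and positions is left to the caller), `bpr_loss` treats every other user in the batch as a negative for user `u`'s target item, and all names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def next_item_loss(h, pos_emb, neg_embs):
    """One term of L_nip: -log softmax score of the true next item
    against the sampled negatives, for a single hidden state h."""
    logits = [dot(h, pos_emb)] + [dot(h, e) for e in neg_embs]
    m = max(logits)  # log-sum-exp with max subtraction for stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]


def bpr_loss(user_reprs, target_embs):
    """L_bpr: sum over ordered pairs (u, u') with u' != u of
    -log sigmoid(h_u . e_u - h_u' . e_u), where e_u is user u's target item."""
    total = 0.0
    for u, e in enumerate(target_embs):
        pos = dot(user_reprs[u], e)
        for v, h_other in enumerate(user_reprs):
            if v == u:
                continue
            diff = pos - dot(h_other, e)
            # -log(sigmoid(diff)) = log(1 + exp(-diff)), computed stably
            total += max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total
```

Both losses shrink as the positive score grows relative to the negatives: `next_item_loss` is 0 when there are no negatives, and each `bpr_loss` term approaches 0 as user `u` scores its own target item far higher than other users do.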

PRECISE: Pre-training Sequential Recommenders with Collaborative and Semantic Information

Semantic-Enhanced Personalized Recommender System

Collaborative Semantic Alignment in Recommendation Systems

MISSRec: Pre-training and Transferring Multi-modal Interest-aware Sequence Representation for Recommendation

Recognizing Semantics-Consistent Subsequences for Sequential Recommendation

SPAR: Personalized Content-Based Recommendation via Long Engagement Attention

Contrastive Pre-training for Sequential Recommendation

When Search Meets Recommendation: Learning Disentangled Search Representation for Recommendation

UPRec: User-aware Pre-training for Sequential Recommendation

Empowering Sequential Recommendation from Collaborative Signals and Semantic Relatedness

Reformulating Sequential Recommendation: Learning Dynamic User Interest with Content-enriched Language Modeling

SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation

SCRIPT: Sequential Cross-Meta-Information Recommendation in Pretrain and Prompt Paradigm

ControlRec: Bridging the Semantic Gap between Language Model and Personalized Recommendation

Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation

Adapting Large Language Models by Integrating Collaborative Semantics for Recommendation

Thoroughly Modeling Multi-domain Pre-trained Recommendation as Language

ID-centric Pre-training for Recommendation

Sequence-level Semantic Representation Fusion for Recommender Systems

Enhancing Sequential Recommendation Via LLM-based Semantic Embedding Learning

Intention-Aware Sequential Recommendation with Structured Intent Transition