Preliminary Study on Incremental Learning for Large Language Model-based Recommender Systems

Tianhao Shi, Yang Zhang, Zhijian Xu, Chong Chen, Fuli Feng, Xiangnan He, Qi Tian
2024-07-30
Abstract: Adapting Large Language Models for Recommendation (LLM4Rec) has shown promising results. However, the challenges of deploying LLM4Rec in real-world scenarios remain largely unexplored. In particular, recommender models need incremental adaptation to evolving user preferences, while the suitability of traditional incremental learning methods within LLM4Rec remains ambiguous due to the unique characteristics of Large Language Models (LLMs). In this study, we empirically evaluate two commonly employed incremental learning strategies (full retraining and fine-tuning) for LLM4Rec. Surprisingly, neither approach shows significant improvements in the performance of LLM4Rec. Instead of dismissing the role of incremental learning, we attribute the lack of anticipated performance enhancement to a mismatch between the LLM4Rec architecture and incremental learning: LLM4Rec employs a single adaptation module for learning recommendations, limiting its ability to simultaneously capture long-term and short-term user preferences in the incremental learning context. To test this speculation, we introduce a Long- and Short-term Adaptation-aware Tuning (LSAT) framework for incremental learning in LLM4Rec. Unlike the single adaptation module approach, LSAT utilizes two distinct adaptation modules to independently learn long-term and short-term user preferences. Empirical results verify that LSAT enhances performance, thereby validating our speculation. We release our code at: https://github.com/TianhaoShi2001/LSAT
Information Retrieval
What problem does this paper attempt to address?
The problem this paper addresses is how to perform incremental learning effectively in large language model-based recommender systems (LLM4Rec) so that they adapt to changes in user preferences over time. Specifically, the research focuses on the following points:

1. **Evaluation of the effectiveness of existing methods**:
   - The authors first empirically evaluate two common incremental learning strategies: full retraining and fine-tuning. The results show that neither method significantly improves performance in LLM4Rec.
2. **Analysis of the root causes of the problem**:
   - The authors speculate that a single LoRA module struggles to capture users' long-term and short-term preferences simultaneously. Specifically:
     - Full retraining may focus more on long-term preferences because of the large amount of historical data.
     - Fine-tuning may lead to catastrophic forgetting, in which new knowledge overwrites old knowledge and degrades overall performance.
3. **Proposing solutions**:
   - To address these problems, the authors propose a new framework named Long- and Short-term Adaptation-aware Tuning (LSAT). The framework uses two independent LoRA modules to learn long-term and short-term user preferences respectively, and combines them at inference time.

### Formula summary

- **Training objective of the short-term LoRA module**:
  \[ \min_{\Theta_t} L(D_t; \Phi, \Theta_t) \]
  where $\Phi$ represents the frozen pre-trained LLM parameters, and $L(D_t; \Phi, \Theta_t)$ is the recommendation loss on the newly collected data $D_t$.
- **Training objective of the long-term LoRA module**:
  \[ \min_{\Theta_h} L(H; \Phi, \Theta_h) \]
  where $H=\{D_1, D_2,\ldots, D_m\}$ represents sufficient historical data.
- **Output fusion in the inference stage**:
  \[ \alpha f(x; \Phi, \Theta_h)+(1 - \alpha) f(x; \Phi, \Theta_t) \]
  where $\alpha$ is a hyperparameter that weights the long-term prediction, and $f(x; \Phi, \Theta_t)$ and $f(x; \Phi, \Theta_h)$ are the predictions made with the short-term and long-term LoRA modules respectively.
- **LoRA fusion**:
  \[ f(x; \Phi, \lambda\Theta_h+(1 - \lambda)\Theta_t) \]
  where $\lambda$ is a hyperparameter that weights the long-term LoRA parameters against the short-term ones before a single forward pass.

With these methods, the authors aim to achieve more effective incremental learning in large language model-based recommender systems and thus adapt better to dynamic changes in user preferences.
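The difference between the two fusion rules can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the function names (`predict`, `output_fusion`, `lora_fusion`), the toy linear scorer standing in for the LLM, and all numeric values are assumptions made for illustration. The sketch also assumes that $\lambda\Theta_h+(1-\lambda)\Theta_t$ means element-wise averaging of the LoRA matrices $A$ and $B$.

```python
import math


def predict(x, w, theta):
    """Toy stand-in for f(x; Phi, Theta).

    w plays the role of the frozen parameters Phi, and theta holds one
    rank-r LoRA module as {"A": r x d matrix, "B": length-r vector}.
    Returns a sigmoid score for the input vector x.
    """
    A, B = theta["A"], theta["B"]
    logit = 0.0
    for j, x_j in enumerate(x):
        delta_j = sum(B[k] * A[k][j] for k in range(len(B)))  # (B @ A)[j]
        logit += (w[j] + delta_j) * x_j
    return 1.0 / (1.0 + math.exp(-logit))


def output_fusion(x, w, theta_h, theta_t, alpha):
    """alpha * f(x; Phi, Theta_h) + (1 - alpha) * f(x; Phi, Theta_t)."""
    return alpha * predict(x, w, theta_h) + (1.0 - alpha) * predict(x, w, theta_t)


def lora_fusion(x, w, theta_h, theta_t, lam):
    """f(x; Phi, lam * Theta_h + (1 - lam) * Theta_t).

    Assumption: the parameters are mixed element-wise, A with A and B with B.
    """
    fused = {
        "A": [
            [lam * a_h + (1.0 - lam) * a_t for a_h, a_t in zip(row_h, row_t)]
            for row_h, row_t in zip(theta_h["A"], theta_t["A"])
        ],
        "B": [
            lam * b_h + (1.0 - lam) * b_t
            for b_h, b_t in zip(theta_h["B"], theta_t["B"])
        ],
    }
    return predict(x, w, fused)


# Hypothetical toy values: 3 input features, rank-1 LoRA modules.
w = [0.2, -0.1, 0.4]                              # frozen "Phi"
theta_h = {"A": [[0.5, 0.0, -0.5]], "B": [1.0]}   # long-term module
theta_t = {"A": [[-0.3, 0.8, 0.1]], "B": [0.5]}   # short-term module
x = [1.0, 2.0, -1.0]

score_output = output_fusion(x, w, theta_h, theta_t, alpha=0.5)
score_lora = lora_fusion(x, w, theta_h, theta_t, lam=0.5)
print(score_output, score_lora)
```

Under these assumptions, output fusion needs two forward passes (one per LoRA module) and mixes the predictions, while LoRA fusion mixes the parameters first and needs only one forward pass. Because the scorer is nonlinear in the LoRA parameters, the two rules generally give different scores for the same mixing weight.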