LIBER: Lifelong User Behavior Modeling Based on Large Language Models

Chenxu Zhu, Shigang Quan, Bo Chen, Jianghao Lin, Xiaoling Cai, Hong Zhu, Xiangyang Li, Yunjia Xi, Weinan Zhang, Ruiming Tang
2024-11-22
Abstract: Click-through rate (CTR) prediction plays a vital role in recommender systems. Recently, large language models (LLMs) have been applied in recommender systems due to their emergent abilities. While leveraging semantic information from LLMs has shown some improvements in the performance of recommender systems, two notable limitations persist in these studies. First, LLM-enhanced recommender systems encounter challenges in extracting valuable information from lifelong user behavior sequences within textual contexts for recommendation tasks. Second, the inherent variability in human behaviors leads to a constant stream of new behaviors and irregularly fluctuating user interests. This characteristic imposes two significant challenges on existing models. On the one hand, it presents difficulties for LLMs in effectively capturing the dynamic shifts in user interests within these sequences, and on the other hand, there exists the issue of substantial computational overhead if the LLMs necessitate recurrent calls upon each update to the user sequences. In this work, we propose Lifelong User Behavior Modeling (LIBER) based on large language models, which includes three modules: (1) User Behavior Streaming Partition (UBSP), (2) User Interest Learning (UIL), and (3) User Interest Fusion (UIF). Initially, UBSP is employed to condense lengthy user behavior sequences into shorter partitions in an incremental paradigm, facilitating more efficient processing. Subsequently, UIL leverages LLMs in a cascading way to infer insights from these partitions. Finally, UIF integrates the textual outputs generated by the aforementioned processes to construct a comprehensive representation, which can be incorporated by any recommendation model to enhance performance. LIBER has been deployed on Huawei's music recommendation service and achieved substantial improvements in users' play count and play time by 3.01% and 7.69%, respectively.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses two main problems:

1. **Understanding lifelong user behavior sequences**: Existing large language models (LLMs) struggle to extract useful information for recommendation from long user behavior sequences presented as text. Their performance declines significantly as the sequence grows, even when the number of tokens is far below the context window limit. User behavior sequences in industrial settings are usually longer still and may exceed the context window entirely, so LLMs cannot handle them effectively.
2. **Inherent variability of user behavior**: Users' interests change over time, which poses two challenges for existing models:
   - Dynamic changes in users' interests are hard to capture, because most existing LLM-enhanced methods treat every item in the behavior sequence equally without considering the timeliness of the items.
   - Whenever the user behavior sequence is updated, the LLM must be run again, which incurs substantial computational cost.

To address these problems, the paper proposes LIBER, an LLM-based lifelong user behavior modeling framework that improves the performance and efficiency of the recommender system through three core modules: User Behavior Streaming Partition (UBSP), User Interest Learning (UIL), and User Interest Fusion (UIF). Specifically, LIBER divides a user's long-term behavior sequence into a short-term behavior cache and long-term behavior memory, and applies the LLM to each long-term behavior memory partition. This alleviates both the difficulty of understanding lifelong behavior sequences and the excessive computational overhead. LIBER also introduces a cascading paradigm that considers the associations between partitions in order to learn how users' interests evolve.
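The cache-and-partition idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed `cache_size` flush rule, the `UserBehaviorStore` class, and the `fake_llm` stand-in are all assumptions made for illustration, since this summary does not specify how partitions are cut or how the LLM is prompted. The sketch only shows why incremental partitioning limits LLM calls and how a cascade can pass the previous partition's summary forward.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signature: (previous_summary, partition_items) -> new_summary.
Summarizer = Callable[[str, List[str]], str]


@dataclass
class UserBehaviorStore:
    """Short-term cache plus long-term partitions for one user (illustrative)."""

    cache_size: int  # assumed fixed capacity; the real partition rule is not given here
    summarize: Summarizer
    cache: List[str] = field(default_factory=list)
    partitions: List[List[str]] = field(default_factory=list)
    summaries: List[str] = field(default_factory=list)
    llm_calls: int = 0

    def add(self, item: str) -> None:
        """Append one new behavior; call the LLM only when the cache fills up."""
        self.cache.append(item)
        if len(self.cache) < self.cache_size:
            return
        # Move the full cache into long-term memory as a new partition.
        partition, self.cache = self.cache, []
        self.partitions.append(partition)
        # Cascade: condition the new summary on the previous partition's summary.
        previous = self.summaries[-1] if self.summaries else ""
        self.summaries.append(self.summarize(previous, partition))
        self.llm_calls += 1


def fake_llm(previous: str, items: List[str]) -> str:
    """Stand-in for a real LLM call, so the sketch runs offline."""
    prefix = f"{previous} -> " if previous else ""
    return prefix + "likes " + ", ".join(items)


store = UserBehaviorStore(cache_size=3, summarize=fake_llm)
for song in ["s1", "s2", "s3", "s4", "s5", "s6", "s7"]:
    store.add(song)

print(store.llm_calls)   # number of (stubbed) LLM invocations
print(store.summaries)   # one summary per long-term partition
print(store.cache)       # behaviors still waiting in the short-term cache
```

In this sketch, seven new behaviors trigger only two summarization calls, and earlier partitions are never re-summarized. A real system would replace `fake_llm` with an actual model call and would still need a fusion step to turn the per-partition text into features for the recommendation model.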