Automated Disentangled Sequential Recommendation with Large Language Models
Xin Wang, Hong Chen, Zirui Pan, Yuwei Zhou, Chaoyu Guan, Lifeng Sun, Wenwu Zhu
DOI: https://doi.org/10.1145/3675164
IF: 4.657
2024-06-29
ACM Transactions on Information Systems
Abstract: Sequential recommendation aims to recommend the next items that a target user may be interested in, based on the user’s sequence of past behaviors, and has become a hot research topic in both academia and industry. In the literature, sequential recommendation adopts a Sequence-to-Item or Sequence-to-Sequence training strategy, which supervises a sequential model with the user’s next one or more behaviors as the labels and the sequence of past behaviors as the input. However, existing powerful sequential recommendation approaches employ increasingly complex deep structures such as the Transformer to capture sequential patterns accurately. These approaches rely heavily on hand-crafted designs of the key attention mechanism to achieve state-of-the-art performance, and thus fail to automatically obtain the optimal design of attention representation architectures for different scenarios and data. Other works on classic automated deep recommender systems focus only on traditional settings and ignore sequential scenarios. In this paper, we study the problem of automated sequential recommendation, which faces two main challenges: i) how to design a proper search space tailored for attention automation in sequential recommendation, and ii) how to accurately search for effective attention representation architectures while considering the multiple user interests reflected in the sequential behavior. To tackle these challenges, we propose an automated disentangled sequential recommendation (AutoDisenSeq) model. In particular, we employ neural architecture search (NAS) and design a search space tailored for automated attention representation in attentive intention-disentangled sequential recommendation, with an expressive and efficient space complexity of \(O(n^{2})\), where \(n\) is the number of layers. We further propose a context-aware parameter sharing mechanism that takes the characteristics of each sub-architecture into account, enabling accurate architecture performance estimation and great flexibility in disentangling latent intention representations. Moreover, we propose AutoDisenSeq-LLM, which uses the textual understanding power of large language models (LLMs) as guidance to refine the candidate list recommended by AutoDisenSeq. We conduct extensive experiments showing that the proposed AutoDisenSeq and AutoDisenSeq-LLM models outperform existing baseline methods on four real-world datasets in both overall recommendation and cold-start recommendation scenarios.
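The two training strategies named in the abstract differ only in how supervision pairs are cut from a user's behavior sequence. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the `max_len` truncation, and the `horizon` parameter are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple


def seq_to_item_pairs(
    behaviors: Sequence[int], max_len: int = 50
) -> List[Tuple[List[int], int]]:
    """Sequence-to-Item: each history prefix is paired with the single next item."""
    pairs = []
    for t in range(1, len(behaviors)):
        # Keep at most max_len most recent behaviors as input (an assumption).
        history = list(behaviors[max(0, t - max_len):t])
        pairs.append((history, behaviors[t]))
    return pairs


def seq_to_seq_pairs(
    behaviors: Sequence[int], horizon: int = 2, max_len: int = 50
) -> List[Tuple[List[int], List[int]]]:
    """Sequence-to-Sequence: each history prefix is paired with the next `horizon` items."""
    pairs = []
    for t in range(1, len(behaviors) - horizon + 1):
        history = list(behaviors[max(0, t - max_len):t])
        labels = list(behaviors[t:t + horizon])
        pairs.append((history, labels))
    return pairs


if __name__ == "__main__":
    clicks = [10, 20, 30, 40]  # hypothetical item ids in chronological order
    print(seq_to_item_pairs(clicks))
    print(seq_to_seq_pairs(clicks, horizon=2))
```

Under either strategy the input is always the past behavior sequence; only the label changes, from one next item to a short window of future items.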
computer science, information systems