Curriculum-scheduled Knowledge Distillation from Multiple Pre-trained Teachers for Multi-domain Sequential Recommendation

Wenqi Sun,Ruobing Xie,Junjie Zhang,Wayne Xin Zhao,Leyu Lin,Ji-Rong Wen
2024-10-15
Abstract:Pre-trained recommendation models (PRMs) have received increasing interest recently. However, their intrinsically heterogeneous model structure, huge model size and computation cost hinder their adoptions in practical recommender systems. Hence, it is highly essential to explore how to use different pre-trained recommendation models efficiently in real-world systems. In this paper, we propose a novel curriculum-scheduled knowledge distillation from multiple pre-trained teachers for multi-domain sequential recommendation, called CKD-MDSR, which takes full advantages of different PRMs as multiple teacher models to boost a small student recommendation model, integrating the knowledge across multiple domains from PRMs. Specifically, CKD-MDSR first adopts curriculum-scheduled user behavior sequence sampling and distills informative knowledge jointly from the representative PRMs such as UniSRec and Recformer. Then, the knowledge from the above PRMs are selectively integrated into the student model in consideration of their confidence and consistency. Finally, we verify the proposed method on multi-domain sequential recommendation and further demonstrate its universality with multiple types of student models, including feature interaction and graph based recommendation models. Extensive experiments on five real-world datasets demonstrate the effectiveness and efficiency of CKD-MDSR, which can be viewed as an efficient shortcut using PRMs in real-world systems.
Information Retrieval
What problem does this paper attempt to address?