LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering

Nikita Severin, Aleksei Ziablitsev, Yulia Savelyeva, Valeriy Tashchilin, Ivan Bulychev, Mikhail Yushkov, Artem Kushneruk, Amaliya Zaryvnykh, Dmitrii Kiselev, Andrey Savchenko, Ilya Makarov
2024-11-01
Abstract: We present LLM-KT, a flexible framework designed to enhance collaborative filtering (CF) models by seamlessly integrating LLM (Large Language Model)-generated features. Unlike existing methods that rely on passing LLM-generated features as direct inputs, our framework injects these features into an intermediate layer of any CF model, allowing the model to reconstruct and leverage the embeddings internally. This model-agnostic approach works with a wide range of CF models without requiring architectural changes, making it adaptable to various recommendation scenarios. Our framework is built for easy integration and modification, providing researchers and developers with a powerful tool for extending CF model capabilities through efficient knowledge transfer. We demonstrate its effectiveness through experiments on the MovieLens and Amazon datasets, where it consistently improves baseline CF models. Experimental studies showed that LLM-KT is competitive with the state-of-the-art methods in context-aware settings but can be applied to a broader range of CF models than current approaches.
Information Retrieval, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to enhance collaborative filtering (CF) models with features generated by large language models (LLMs) in order to improve recommendation performance. Existing methods usually pass LLM-generated features directly as inputs to the CF model. This restricts them to specific context-aware models and rules out other types of CF models. To remove this restriction, the authors propose a framework named "LLM-KT", which injects LLM-generated features into an intermediate layer of the CF model, so that the model learns to reconstruct and use these features internally without any change to its architecture. As a result, LLM-KT applies to a wider range of CF models. Experiments on the MovieLens and Amazon datasets show that it consistently improves baseline CF models and is competitive with existing state-of-the-art methods in context-aware scenarios.

### Main problem summary:

1. **Limitations of existing methods**: Existing knowledge transfer methods mainly pass LLM-generated features directly as inputs to the CF model, which limits them to specific context-aware models.
2. **Objective**: Propose a general framework that integrates LLM-generated features into various CF models without modifying the model architecture, thereby widening the applicability of knowledge transfer and improving recommendation performance.

### Solutions:

- **LLM-KT framework**: LLM-generated features are injected into an intermediate layer of the CF model, and the model is trained to reconstruct them internally, which enhances its understanding of user preferences.
- **Flexibility**: The framework applies to multiple CF models, including traditional models based on user-item interaction data and context-aware models.
- **Experimental verification**: Experiments on the MovieLens and Amazon datasets demonstrate the effectiveness and wide applicability of LLM-KT.

### Formula representation:

During training, a combined loss function \( L_{\text{combined}} \) is used, defined as:

\[ L_{\text{combined}}=\alpha\cdot L_{\text{KT}}+(1 - \alpha)\cdot L_{\text{model}} \]

where:

- \( L_{\text{model}} \) is the loss function of the selected CF model (for example, binary cross-entropy loss or mean squared error loss).
- \( L_{\text{KT}} \) is the knowledge transfer loss, defined as:

\[ L_{\text{KT}}(Z_u, P_u)=L_{\text{reconstruct}}(Z_u,\text{Trans}(P_u)) \]

Here, \( P_u \) is the profile embedding of user \( u \), \( Z_u \) is the output of the \( K \)-th layer of the CF model, and \(\text{Trans}\) is a transformation function that aligns the profile embedding with the representation space of that layer.

In this way, the LLM-KT framework transfers LLM-generated knowledge to the CF model without changing the model architecture, thereby improving recommendation performance.
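
The combined loss above can be illustrated with a minimal sketch in plain Python. The summary does not specify the form of \( L_{\text{reconstruct}} \) or \(\text{Trans}\), so this sketch assumes mean squared error for the reconstruction loss and a bias-free linear projection for the transformation. The function names (`linear_trans`, `kt_loss`, `combined_loss`) and the `weights` matrix are hypothetical and are not taken from the LLM-KT code.

```python
from typing import List, Sequence

Vector = List[float]


def linear_trans(p_u: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Assumed form of Trans: a bias-free linear projection of the profile
    embedding P_u into the space of the K-th layer output."""
    return [sum(w * p for w, p in zip(row, p_u)) for row in weights]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed form of L_reconstruct: mean squared error between two vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def kt_loss(z_u: Sequence[float], p_u: Sequence[float],
            weights: Sequence[Sequence[float]]) -> float:
    """L_KT(Z_u, P_u) = L_reconstruct(Z_u, Trans(P_u))."""
    return mse(z_u, linear_trans(p_u, weights))


def combined_loss(model_loss: float, z_u: Sequence[float], p_u: Sequence[float],
                  weights: Sequence[Sequence[float]], alpha: float) -> float:
    """L_combined = alpha * L_KT + (1 - alpha) * L_model."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * kt_loss(z_u, p_u, weights) + (1.0 - alpha) * model_loss


if __name__ == "__main__":
    p_u = [1.0, 2.0, 3.0]                          # toy profile embedding P_u
    weights = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # toy projection: 3 dims to 2
    z_u = [1.0, 4.0]                               # toy K-th layer output Z_u
    print(combined_loss(model_loss=1.0, z_u=z_u, p_u=p_u,
                        weights=weights, alpha=0.25))
```

With these toy values, \(\text{Trans}(P_u) = (1, 2)\), so \( L_{\text{KT}} = ((1-1)^2 + (4-2)^2)/2 = 2 \) and \( L_{\text{combined}} = 0.25 \cdot 2 + 0.75 \cdot 1 = 1.25 \). Setting `alpha` to 0 recovers the plain CF loss, and setting it to 1 trains only on reconstruction.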