A Prompting-Based Representation Learning Method for Recommendation with Large Language Models

Junyi Chen, Toyotaro Suzumura
2024-10-01
Abstract: In recent years, Recommender Systems (RS) have undergone a transformative shift with the advent of Large Language Models (LLMs) in Natural Language Processing (NLP). Models such as GPT-3.5/4 and Llama have demonstrated unprecedented capabilities in understanding and generating human-like text. The extensive knowledge these LLMs acquire during pre-training makes it possible to capture deeper semantic representations from the contextual information of users and items. Despite this potential, the challenge of extracting user-item preferences from contextual information, and of aligning them with improvements in Recommender Systems, remains to be addressed. Believing that a better understanding of the user or item itself can be the key factor in improving recommendation performance, we conduct research on generating informative profiles using state-of-the-art LLMs. To bring the linguistic abilities of LLMs to Recommender Systems, we introduce the Prompting-Based Representation Learning Method for Recommendation (P4R). In our P4R framework, we use an LLM prompting strategy to create personalized item profiles. These profiles are then mapped into a semantic representation space using a pre-trained BERT model for text embedding. Furthermore, we incorporate a Graph Convolution Network (GCN) for collaborative filtering representation. The P4R framework aligns these two embedding spaces in order to address general recommendation tasks. In our evaluation, we compare P4R with state-of-the-art recommender models and assess the quality of prompt-based profile generation.
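The pipeline the abstract describes (text embeddings of item profiles feeding a GCN over the user-item graph) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: random vectors stand in for the BERT profile embeddings, `projection` and `user_emb` stand in for parameters that would be learned, the propagation rule is a LightGCN-style layer average, and the helper names `normalized_adjacency` and `propagate` are hypothetical. Training is omitted.

```python
import numpy as np


def normalized_adjacency(interactions: np.ndarray) -> np.ndarray:
    """Symmetric-normalized adjacency D^-1/2 A D^-1/2 of the user-item bipartite graph."""
    n_users, n_items = interactions.shape
    n = n_users + n_items
    adj = np.zeros((n, n))
    adj[:n_users, n_users:] = interactions      # user -> item edges
    adj[n_users:, :n_users] = interactions.T    # item -> user edges
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5    # isolated nodes keep a zero weight
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def propagate(adj_norm, user_emb, item_emb, n_layers=2):
    """LightGCN-style neighbor aggregation (an assumed stand-in for the paper's GCN).

    Returns the average of the embeddings from layers 0..n_layers.
    """
    x = np.vstack([user_emb, item_emb])
    layers = [x]
    for _ in range(n_layers):
        x = adj_norm @ x                        # aggregate neighbor embeddings
        layers.append(x)
    final = np.mean(layers, axis=0)
    n_users = user_emb.shape[0]
    return final[:n_users], final[n_users:]


rng = np.random.default_rng(0)

# Toy interaction matrix: 3 users x 4 items (1 = the user interacted with the item).
interactions = np.array(
    [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]],
    dtype=float,
)
n_users, n_items = interactions.shape
text_dim, cf_dim = 8, 4

# Assumption: random vectors stand in for BERT embeddings of LLM-generated item profiles.
profile_emb = rng.normal(size=(n_items, text_dim))
# Assumption: a linear projection (learned in practice) maps text space to CF space.
projection = rng.normal(size=(text_dim, cf_dim)) * 0.1
# Assumption: free user embeddings (learned in practice).
user_emb = rng.normal(size=(n_users, cf_dim)) * 0.1

item_emb = profile_emb @ projection
adj_norm = normalized_adjacency(interactions)
users_out, items_out = propagate(adj_norm, user_emb, item_emb)

# Dot-product relevance score for every (user, item) pair.
scores = users_out @ items_out.T
```

In this sketch the text-derived item vectors enter the graph as initial node features, so neighbor aggregation mixes semantic and collaborative signals. That is one plausible reading of "aligning the two embedding spaces", not a confirmed description of P4R.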
Information Retrieval
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper addresses how to leverage large language models (LLMs) to enhance recommender systems (RS). Specifically, it focuses on the following aspects:

1. **User-Item Preference Extraction**: Traditional recommender systems often overlook rich textual information (such as user reviews, occupations, item locations, and categories) when handling user-item interaction data, even though this information can benefit recommendation. The paper proposes generating informative item profiles with state-of-the-art LLMs in order to better capture user preferences.
2. **Semantic Representation Learning**: Although existing methods based on Graph Convolutional Networks (GCN) have been successful in recommender systems, they rely mainly on user-item interaction matrices and ignore the richness of textual information. The paper proposes the Prompting-Based Representation Learning Method for Recommendation (P4R), which uses a pre-trained BERT model to map the generated item profiles into a semantic representation space and uses a GCN for collaborative filtering representation.
3. **Aligning Different Representation Spaces**: The P4R framework addresses general recommendation tasks by aligning the textual embedding space with the GCN's collaborative filtering representation space. According to the paper, this improves recommendation performance and is particularly suitable for small companies or organizations with limited resources.

### Main Contributions

1. **Connecting GNN and LLMs**: By combining a GNN-based collaborative filtering recommendation framework with the open-source Llama-2-7b large language model, P4R enhances existing item profile representations to improve performance. The method performs well with most lightweight GNN recommender systems.
2. **Using BERT for Text Embedding**: Using the pre-trained BERT model architecture, the natural language generation task is connected to the general recommendation task by adjusting the embedding vectors of the generated profiles. The LLM-enhanced embeddings are also reported to benefit representation learning.
3. **Context-Aware Prompt Format**: A context-aware prompt format is proposed for generating informative item profiles. It emphasizes that intrinsic and extrinsic textual information should be treated differently in prompts. The approach is flexible: it handles various kinds of candidate textual information without extensive pre-training of large language models.
4. **Performance Evaluation**: P4R is compared with several state-of-the-art recommendation models to validate its performance. The paper also explores which contextual information is crucial for LLM-based recommenders and conducts ablation studies to analyze the impact of different design choices on LLM-based recommender systems.

### Method Overview

1. **Auxiliary Feature Extraction**: Textual information such as item names, categories, and locations is introduced to improve the understanding of user preferences. Item profiles are generated with In-Context Learning (ICL), a strategy known to help LLMs adapt to many downstream tasks.
2. **Text Embedding and Representation**: The generated item profiles distill the important item features from a large amount of contextual information. A pre-trained BERT model is used to learn the semantic representation of each item profile.
3. **GNN-Based Alignment**: Through the neighbor aggregation mechanism of the GCN, the textual embedding space is aligned with the GCN's collaborative filtering representation space, optimizing the user and item representations to improve recommendation effectiveness.

### Experimental Results

1. **Overall Performance**: P4R outperforms the baseline models on multiple evaluation metrics (Recall, NDCG, MRR, and Hit Rate). In the best validation test results, P4R outperforms the best baseline model by 8.4% and 7.0% on Recall@10 and Recall@20, by 16.5% and 16.1% on NDCG@10 and NDCG@20, and by 21.2% and 21.0% on MRR@10 and MRR@20, respectively.
2.