Are ID Embeddings Necessary? Whitening Pre-trained Text Embeddings for Effective Sequential Recommendation

Lingzi Zhang, Xin Zhou, Zhiwei Zeng, Zhiqi Shen

2024-02-16

Abstract: Recent sequential recommendation models have combined pre-trained text embeddings of items with item ID embeddings to achieve superior recommendation performance. Despite their effectiveness, the expressive power of text features in these models remains largely unexplored. While most existing models emphasize the importance of ID embeddings in recommendations, our study takes a step further by studying sequential recommendation models that only rely on text features and do not necessitate ID embeddings. Upon examining pre-trained text embeddings experimentally, we discover that they reside in an anisotropic semantic space, with an average cosine similarity of over 0.8 between items. We also demonstrate that this anisotropic nature hinders recommendation models from effectively differentiating between item representations and leads to degenerated performance. To address this issue, we propose to employ a pre-processing step known as whitening transformation, which transforms the anisotropic text feature distribution into an isotropic Gaussian distribution. Our experiments show that whitening pre-trained text embeddings in the sequential model can significantly improve recommendation performance. However, the full whitening operation might break the potential manifold of items with similar text semantics. To preserve the original semantics while benefiting from the isotropy of the whitened text features, we introduce WhitenRec+, an ensemble approach that leverages both fully whitened and relaxed whitened item representations for effective recommendations. We further discuss and analyze the benefits of our design through experiments and proofs. Experimental results on three public benchmark datasets demonstrate that WhitenRec+ outperforms state-of-the-art methods for sequential recommendation.
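The anisotropy figure quoted in the abstract (average cosine similarity between items) can be illustrated with a minimal sketch. The function name `average_cosine_similarity` and the synthetic data are assumptions made for illustration only; they are not taken from the paper, and real pre-trained text embeddings would replace the random matrices.

```python
import numpy as np


def average_cosine_similarity(embeddings: np.ndarray) -> float:
    """Mean cosine similarity over all distinct pairs of rows."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    sim = unit @ unit.T
    n = sim.shape[0]
    # Drop the diagonal (self-similarity) before averaging.
    return float((sim.sum() - np.trace(sim)) / (n * (n - 1)))


# Synthetic stand-ins (assumption): zero-mean Gaussian vectors are roughly
# isotropic; adding a large shared offset pushes every vector in the same
# direction and mimics an anisotropic embedding space.
rng = np.random.default_rng(0)
isotropic = rng.standard_normal((500, 64))
anisotropic = isotropic + 5.0

print("isotropic:  ", average_cosine_similarity(isotropic))
print("anisotropic:", average_cosine_similarity(anisotropic))
```

A value close to 1 means most item vectors point in nearly the same direction, which is the situation the abstract describes as hard for a recommender to discriminate.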

Information Retrieval

What problem does this paper attempt to address?

The problem this paper attempts to address is how to improve sequential recommendation performance using only pre-trained text embeddings, without ID embeddings. Existing sequential recommendation models typically combine pre-trained text embeddings with ID embeddings to achieve better results. However, the expressive power of text features in these models has not been fully explored, and most existing models emphasize the importance of ID embeddings. This paper explores sequential recommendation models that rely solely on text features and finds that pre-trained text embeddings are highly anisotropic (the average cosine similarity between items exceeds 0.8), which makes it difficult for recommendation models to distinguish between item representations and degrades recommendation performance.

To address this issue, the authors propose the following methods:

1. **Whitening Transformation**: Pre-trained text embeddings are transformed into an isotropic Gaussian distribution, which removes the correlation between dimensions and significantly improves recommendation performance.
2. **WhitenRec+**: Because full whitening might break the manifold of items with similar text semantics, the authors introduce an ensemble approach that combines fully whitened and relaxed whitened item representations, retaining some of the original semantics while keeping the benefits of isotropy.

Experimental results show that WhitenRec+ outperforms existing state-of-the-art methods on three public benchmark datasets.
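The whitening step described above can be sketched as follows. This is a minimal illustration that assumes ZCA whitening computed from the eigendecomposition of the sample covariance; the summary does not state which whitening variant the paper uses, and it does not specify the relaxed whitening used in WhitenRec+, so that part is not shown. The names `whiten`, `raw`, and `eps` are hypothetical.

```python
import numpy as np


def whiten(embeddings: np.ndarray, eps: float = 1e-5) -> np.ndarray:
    """ZCA-whiten rows so they have zero mean and near-identity covariance."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / centered.shape[0]
    # eigh is appropriate because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eps keeps the inverse square root stable for tiny eigenvalues.
    transform = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ transform


# Synthetic stand-in for pre-trained text embeddings (assumption): unequal
# per-dimension scales plus a shared offset give an anisotropic distribution.
rng = np.random.default_rng(0)
raw = rng.standard_normal((2000, 16)) * np.linspace(0.5, 3.0, 16) + 8.0
whitened = whiten(raw)

cov_after = whitened.T @ whitened / whitened.shape[0]
print("max deviation from identity:", np.abs(cov_after - np.eye(16)).max())
```

In a pipeline of the kind described here, the whitened vectors would be computed once as a pre-processing step and then fed to the sequential model in place of the raw text embeddings.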

ID-centric Pre-training for Recommendation

Enhancing Sequential Recommendation Via LLM-based Semantic Embedding Learning

Integrating the Pre-trained Item Representations with Reformed Self-attention Network for Sequential Recommendation

Graph-Based Embedding Smoothing for Sequential Recommendation

Towards more effective encoders in pre-training for sequential recommendation

ID Embedding as Subtle Features of Content and Structure for Multimodal Recommendation

Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations

ID-Agnostic User Behavior Pre-training for Sequential Recommendation

Temporal Item Embedding With Static Similarity Regularization For Sequential Recommendation

A General Tail Item Representation Enhancement Framework for Sequential Recommendation

Long-Sequence Recommendation Models Need Decoupled Embeddings

Contextual MAB Oriented Embedding Denoising for Sequential Recommendation

Sense-Based Topic Word Embedding Model for Item Recommendation

Sequential Recommendation with Decomposed Item Feature Routing

Sequence-level Semantic Representation Fusion for Recommender Systems

SSDRec: Self-Augmented Sequence Denoising for Sequential Recommendation

Joint Text Embedding for Personalized Content-based Recommendation

Collaborative Word-based Pre-trained Item Representation for Transferable Recommendation

Exploiting Review Embedding and User Attention for Item Recommendation

Deep Bi-LSTM Networks for Sequential Recommendation