Abstract: Since the creation of the Web, recommender systems (RSs) have been an indispensable mechanism in information filtering. State-of-the-art RSs primarily depend on categorical features, which are encoded by embedding vectors, resulting in excessively large embedding tables. To prevent over-parameterized embedding tables from harming scalability, both academia and industry have seen increasing efforts in compressing RS embeddings. However, despite the prosperity of lightweight embedding-based RSs (LERSs), evaluation protocols vary widely, which makes it hard to relate LERS performance to real-world usability. Moreover, despite the common goal of lightweight embeddings, each LERS is typically evaluated on only one of the two main recommendation tasks: collaborative filtering or content-based recommendation. This lack of discussion on cross-task transferability hinders the development of unified, more scalable solutions. Motivated by these issues, this study investigates various LERSs' performance, efficiency, and cross-task transferability via a thorough benchmarking process. Additionally, we propose an efficient embedding compression method using magnitude pruning, which is an easy-to-deploy yet highly competitive baseline that outperforms various complex LERSs. Our study reveals the distinct performance of LERSs across the two tasks, shedding light on their effectiveness and generalizability. To support edge-based recommendations, we tested all LERSs on a Raspberry Pi 4, where the efficiency bottleneck is exposed. Finally, we conclude this paper with critical summaries of LERS performance, model selection suggestions, and underexplored challenges around LERSs for future research. To encourage future research, we publish source code and artifacts at https://github.com/chenxing1999/recsys-benchmark.
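The abstract names magnitude pruning as the proposed baseline but does not spell out the procedure. As a rough illustration only, the sketch below applies global magnitude pruning to an embedding table with NumPy: the entries with the smallest absolute values are set to zero until a target sparsity is reached. The function name `magnitude_prune`, the `sparsity` parameter, and the choice of a single global threshold are assumptions made for this example; the paper's actual method may differ (for instance, in when pruning happens during training).

```python
import numpy as np


def magnitude_prune(table: np.ndarray, sparsity: float) -> np.ndarray:
    """Return a copy of `table` with the smallest-magnitude entries zeroed.

    `sparsity` is the fraction of entries to remove (hypothetical parameter,
    not taken from the paper).
    """
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")

    n_prune = int(table.size * sparsity)
    if n_prune == 0:
        return table.copy()

    flat_magnitudes = np.abs(table).ravel()
    # argpartition places the n_prune smallest magnitudes in the first
    # n_prune positions without fully sorting the array.
    prune_idx = np.argpartition(flat_magnitudes, n_prune - 1)[:n_prune]

    keep_mask = np.ones(table.size, dtype=bool)
    keep_mask[prune_idx] = False
    return table * keep_mask.reshape(table.shape)


# Example: prune half of a small random embedding table.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 4))   # 6 items, embedding dimension 4
pruned = magnitude_prune(embeddings, sparsity=0.5)
```

A pruned table only saves memory if it is stored in a sparse format afterwards; the dense array above keeps the same footprint and is meant purely to show which entries are kept.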

What problem does this paper attempt to address?

The paper aims to address several issues in the performance evaluation of lightweight embedding-based recommender systems (LERSs). Specifically:

1. **Unified Evaluation Standards**: Current LERS studies adopt different evaluation protocols, making it difficult to compare the actual effectiveness of different methods. The paper addresses this through systematic benchmarking under a common setup.
2. **Transferability Between Tasks**: Existing LERSs are typically designed and evaluated for either content-based recommendation or collaborative filtering, and little is known about how a method built for one task performs on the other. The paper explores the generality and effectiveness of these methods across both tasks.
3. **Efficiency and Memory Consumption in Actual Deployment**: Besides the number of parameters and recommendation accuracy, inference speed and runtime memory consumption also matter in practice, yet existing research often overlooks them. The paper measures the resource requirements of different methods in a real deployment environment.

In summary, the main goal of the paper is to conduct a comprehensive benchmark of various LERSs, covering their accuracy on different tasks, their efficiency, and their memory consumption, thereby providing reference information for practical applications.
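To make the three deployment metrics mentioned above concrete (parameter count, inference speed, runtime memory), here is a minimal measurement sketch using only the Python standard library and NumPy. It is not the paper's benchmarking harness: `profile_lookup`, the `repeats` parameter, and the use of a plain embedding lookup as the "inference" step are all assumptions for illustration, and `tracemalloc` only sees allocations made through Python's allocator hooks.

```python
import time
import tracemalloc

import numpy as np


def profile_lookup(table: np.ndarray, ids: np.ndarray, repeats: int = 20) -> dict:
    """Measure a plain embedding lookup (hypothetical stand-in for inference)."""
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(repeats):
        _ = table[ids]                      # gather one batch of embeddings
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {
        "params": int(table.size),                      # stored parameters
        "nonzero_params": int(np.count_nonzero(table)),  # after any pruning
        "latency_ms": 1000.0 * elapsed / repeats,        # mean time per batch
        "peak_bytes": int(peak_bytes),                   # peak traced memory
    }


# Example: a 10,000 x 16 table queried with a batch of 256 random ids.
rng = np.random.default_rng(0)
table = rng.normal(size=(10_000, 16)).astype(np.float32)
batch_ids = rng.integers(0, table.shape[0], size=256)
report = profile_lookup(table, batch_ids)
```

Numbers collected this way depend heavily on the host machine, which is one reason on-device measurements such as the Raspberry Pi 4 runs described in the abstract are informative.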

A Thorough Performance Benchmarking on Lightweight Embedding-based Recommender Systems
