Abstract: In an age of abundant data, the ability to distill meaningful insights from a sea of information is essential. Our research addresses the computational and resource inefficiencies of current Sequential Recommender Systems (SRSs), especially those employing attention-based models such as SASRec. These systems are designed for next-item recommendation in applications ranging from e-commerce to social networks, but they incur substantial computational cost and resource consumption during the inference stage. To tackle these issues, we propose a novel method that combines automatic pruning techniques with advanced model architectures. We also explore the potential of resource-constrained Neural Architecture Search (NAS), a technique prevalent in recommendation systems, to tune models for reduced FLOPs, latency, and energy usage while retaining or even improving accuracy. The main contribution of our work is Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems (EASRec). This approach aims to find optimal compact architectures for attention-based SRSs while retaining accuracy. EASRec introduces data-aware gates that leverage historical information from the input data batch to improve the performance of the recommendation network. Additionally, it uses a dynamic resource constraint approach, which standardizes the search process and yields more appropriate architectures. The effectiveness of our methodology is validated through exhaustive experiments on three benchmark datasets, which demonstrate EASRec's superiority among SRSs. Our research sets a new standard for future work on efficient and accurate recommender systems, marking a substantial advance in this rapidly evolving field.
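To make the resource argument concrete, the following is a minimal sketch of how pruning attention heads reduces the approximate compute of one SASRec-style Transformer block, and how a FLOPs budget could be turned into a penalty term during architecture search. It is an illustration under stated assumptions, not the EASRec implementation: the function names, the cost formula (multiply-accumulate counts for the linear and attention operations only), the example sizes, and the hinge-style penalty are all hypothetical choices made here.

```python
def attention_block_cost(seq_len, d_model, n_heads, kept_heads, d_ffn):
    """Approximate multiply-accumulate count for one Transformer block.

    Assumption: only the matrix products are counted; softmax, layer norm,
    and residual additions are ignored.
    """
    d_head = d_model // n_heads
    d_kept = d_head * kept_heads              # width left after head pruning
    qkv = 3 * seq_len * d_model * d_kept      # Q, K, V projections
    scores = seq_len * seq_len * d_kept       # Q K^T across the kept heads
    context = seq_len * seq_len * d_kept      # attention weights times V
    out = seq_len * d_kept * d_model          # output projection
    ffn = 2 * seq_len * d_model * d_ffn       # two feed-forward linear layers
    return qkv + scores + context + out + ffn


def resource_penalty(cost, budget):
    """Hypothetical hinge penalty: zero within budget, linear in the overshoot."""
    return max(0.0, cost / budget - 1.0)


# Illustrative sizes (assumed, not taken from the paper).
full_cost = attention_block_cost(seq_len=200, d_model=64, n_heads=2,
                                 kept_heads=2, d_ffn=64)
pruned_cost = attention_block_cost(seq_len=200, d_model=64, n_heads=2,
                                   kept_heads=1, d_ffn=64)

print(f"full block:   {full_cost} MACs")
print(f"pruned block: {pruned_cost} MACs")
print(f"penalty of full block under the pruned budget: "
      f"{resource_penalty(full_cost, budget=pruned_cost):.3f}")
```

In a search procedure, a penalty of this kind would typically be added to the recommendation loss so that candidate architectures exceeding the budget are discouraged; how EASRec varies its constraint dynamically is not specified in the abstract and is not modeled here.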

EASRec: Elastic Architecture Search for Efficient Long-term Sequential Recommender Systems