Abstract: Modern recommender systems are built upon computation-intensive infrastructure, and it is challenging to perform real-time computation for each request, especially in peak periods, due to the limited computational resources. Recommending by user-wise result caches is widely used when the system cannot afford a real-time recommendation. However, it is challenging to allocate real-time and cached recommendations to maximize the users' overall engagement. This paper shows two key challenges to cache allocation, i.e., the value-strategy dependency and the streaming allocation. Then, we propose a reinforcement prediction-allocation framework (RPAF) to address these issues. RPAF is a reinforcement-learning-based two-stage framework containing prediction and allocation stages. The prediction stage estimates the values of the cache choices considering the value-strategy dependency, and the allocation stage determines the cache choices for each individual request while satisfying the global budget constraint. We show that the challenge of training RPAF includes globality and the strictness of budget constraints, and a relaxed local allocator (RLA) is proposed to address this issue. Moreover, a PoolRank algorithm is used in the allocation stage to deal with the streaming allocation problem. Experiments show that RPAF significantly improves users' engagement under computational budget constraints.

What problem does this paper attempt to address?

This paper attempts to address the issue of cache allocation in large-scale recommendation systems, particularly how to maximize overall user engagement under limited computational resources. Specifically, the paper focuses on two key challenges:

1. **Value-Strategy Dependency**: Existing computational resource allocation methods assume that requests in different time periods are independent and that the value of computational resources is independent of the allocation strategy. However, these assumptions do not hold in the cache allocation problem. On one hand, the size of the result cache is limited, and if the system continuously recommends cached results to the same user, the cache will quickly be exhausted, and user experience will rapidly decline. On the other hand, the system's choice of whether to use the cache not only affects the user feedback of the current request but also influences the user's future behavior. Therefore, the value of the current cache choice also depends on future cache allocation strategies.
2. **Streaming Allocation**: Existing computational resource allocation methods typically allocate a batch of requests within each time period. However, requests in online recommendation systems arrive in a streaming manner, and the system needs to determine cache choices for each individual request as it arrives while satisfying global computational budget constraints.

To address these challenges, the paper proposes a Reinforcement Prediction-Allocation Framework (RPAF). RPAF is a two-stage approach that includes a prediction stage and an allocation stage:

- **Prediction Stage**: Uses reinforcement learning to estimate the value of different cache choices, considering value-strategy dependency.
- **Allocation Stage**: Uses the estimated values for streaming allocation while satisfying global budget constraints.

To handle the global and strict nature of budget constraints, the paper introduces a Relaxed Local Allocator (RLA), which transforms the constrained reinforcement learning problem into a computationally feasible form. Additionally, the paper proposes a PoolRank algorithm to handle the streaming allocation problem, ensuring that budget constraints are strictly met at each time step. Experimental validation shows that RPAF significantly improves user engagement under computational budget constraints.
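
To make the streaming allocation challenge concrete, the following is a minimal Python sketch of a per-request allocator under a hard budget. It is an illustration only and is not the paper's PoolRank algorithm, whose details are not given here. It assumes the prediction stage supplies, for each request, a scalar `value_gap` (the estimated value of real-time computation minus the estimated value of serving from cache). The class name `StreamingAllocator` and the parameters `budget_ratio`, `window_budget`, and `pool_size` are hypothetical.

```python
from collections import deque


class StreamingAllocator:
    """Hypothetical pool-based streaming allocator (illustration only)."""

    def __init__(self, budget_ratio, window_budget, pool_size=1000):
        self.budget_ratio = budget_ratio     # target fraction of requests computed in real time
        self.window_budget = window_budget   # hard cap on real-time computations per window
        self.pool = deque(maxlen=pool_size)  # value gaps of the most recent requests
        self.used = 0                        # real-time computations spent in the current window

    def start_window(self):
        """Reset the spent budget at the start of a new time window."""
        self.used = 0

    def decide(self, value_gap):
        """Return True for real-time computation, False for serving from cache."""
        if self.pool:
            # Threshold is the (1 - budget_ratio) quantile of recently seen gaps.
            ranked = sorted(self.pool)
            idx = min(int(len(ranked) * (1 - self.budget_ratio)), len(ranked) - 1)
            threshold = ranked[idx]
        else:
            threshold = float("-inf")  # no history yet, so accept if budget allows
        self.pool.append(value_gap)
        # The hard cap guarantees the window budget is never exceeded.
        use_real_time = value_gap >= threshold and self.used < self.window_budget
        if use_real_time:
            self.used += 1
        return use_real_time


allocator = StreamingAllocator(budget_ratio=0.5, window_budget=2)
decisions = [allocator.decide(gap) for gap in [1.0, 2.0, 3.0, 4.0]]
```

Each request is decided on arrival using only past requests, so no batch is needed. The quantile threshold spreads real-time computation toward requests with the largest estimated gain, while the counter enforces the budget strictly. Sorting the pool on every request is done here for clarity; a production system would likely maintain an incremental order statistic instead.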

RPAF: A Reinforcement Prediction-Allocation Framework for Cache Allocation in Large-Scale Recommender Systems

A Cluster-Based Incremental Recommendation Algorithm on Stream Processing Architecture

LPCA: Learned MRC Profiling based Cache Allocation for File Storage Systems

Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems

RL-MPCA: A Reinforcement Learning Based Multi-Phase Computation Allocation Approach for Recommender Systems

Efficient Cache Resource Aggregation Using Adaptive Multi-Level Exclusive Caching Policies

Computation Resource Allocation Solution in Recommender Systems

Cache-Enabled Dynamic Rate Allocation via Deep Self-Transfer Reinforcement Learning

Federated Distributed Deep Reinforcement Learning for Recommendation-enabled Edge Caching

Balancing Accuracy and Fairness for Interactive Recommendation with Reinforcement Learning

Learning-based Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation

RL-Cache: an Efficient Reinforcement Learning Based Cache Partitioning Approach for Multi-Tenant CDN Services

RBRA: A Simple and Efficient Rating-Based Recommender Algorithm to Cope with Sparsity in Recommender Systems

Long-term Recommender System Based on ACP Framework

DCAF: A Dynamic Computation Allocation Framework for Online Serving System

Toward Pareto Efficient Fairness-Utility Trade-off in Recommendation through Reinforcement Learning

AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement

Cache-Enhanced InBatch Sampling with Difficulty-Based Replacement Strategies for Learning Recommenders

CP-operated Dash Caching Via Reinforcement Learning

Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning

PC-Allocation: Performance Cliff-Aware Two-Level Cache Resource Allocation Scheme for Storage System