AutoSAM: Towards Automatic Sampling of User Behaviors for Sequential Recommender Systems

Hao Zhang, Mingyue Cheng, Qi Liu, Zhiding Liu, Junzhe Jiang, Enhong Chen
2024-07-16
Abstract: Sequential recommender systems (SRS) have gained widespread popularity in recommendation due to their ability to effectively capture dynamic user preferences. One default setting in the current SRS is to uniformly consider each historical behavior as a positive interaction. Actually, this setting has the potential to yield sub-optimal performance, as each item makes a distinct contribution to the user's interest. For example, purchased items should be given more importance than clicked ones. Hence, we propose a general automatic sampling framework, named AutoSAM, to non-uniformly treat historical behaviors. Specifically, AutoSAM augments the standard sequential recommendation architecture with an additional sampler layer to adaptively learn the skew distribution of the raw input, and then sample informative sub-sets to build more generalizable SRS. To overcome the challenges of non-differentiable sampling actions and also introduce multiple decision factors for sampling, we further introduce a novel reinforcement learning based method to guide the training of the sampler. We theoretically design multi-objective sampling rewards including Future Prediction and Sequence Perplexity, and then optimize the whole framework in an end-to-end manner by combining the policy gradient. We conduct extensive experiments on benchmark recommender models and four real-world datasets. The experimental results demonstrate the effectiveness of the proposed approach. We will make our code publicly available after the acceptance.
Information Retrieval
What problem does this paper attempt to address?
This paper addresses how to use users' historical behavior data more effectively to improve recommendation performance in Sequential Recommender Systems (SRS). Existing SRS usually treat every historical behavior as a positive interaction of equal importance. This can lead to sub-optimal recommendations, because different behaviors contribute differently to a user's interests. For example, purchased items should carry more weight than items that were only clicked. To solve this problem, the paper proposes AutoSAM, a general automatic sampling framework that treats historical behaviors non-uniformly. AutoSAM adds a sampler layer to the standard sequential recommendation architecture. The sampler adaptively learns the skewed distribution of the raw input and samples informative subsets from it to build a more generalizable SRS. Because discrete sampling actions are non-differentiable, and because sampling should take multiple decision factors into account, the paper trains the sampler with a reinforcement-learning-based method. It designs multi-objective sampling rewards, including Future Prediction and Sequence Perplexity, and optimizes the whole framework end to end using the policy gradient. Overall, the work aims to improve the recommendation accuracy and generalization ability of SRS by changing how the input behaviors are selected, so that the model better captures users' dynamic preferences.
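To make the policy-gradient idea concrete, here is a minimal sketch of training a keep/drop sampler with REINFORCE. It is not the authors' implementation: the per-position logits, the moving-average baseline, the function names, and especially `toy_reward` are assumptions made for illustration. The toy reward simply favors purchases over clicks and stands in for the paper's Future Prediction and Sequence Perplexity rewards, which are not reproduced here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Sample a keep (1) / drop (0) decision per behavior from Bernoulli(sigmoid(logit))."""
    probs = [sigmoid(z) for z in logits]
    mask = [1 if rng.random() < p else 0 for p in probs]
    return mask, probs


def reinforce_step(logits, mask, probs, reward, baseline, lr):
    """One REINFORCE update for a Bernoulli policy.

    For action a_i with keep-probability p_i = sigmoid(logit_i),
    d log pi(a_i) / d logit_i = a_i - p_i.
    """
    advantage = reward - baseline
    return [z + lr * advantage * (a - p) for z, a, p in zip(logits, mask, probs)]


def toy_reward(mask, behaviors):
    """Hypothetical stand-in reward: +1 per kept purchase, -1 per kept click."""
    return sum(a * (1.0 if b == "purchase" else -1.0) for a, b in zip(mask, behaviors))


def train_sampler(behaviors, steps=2000, lr=0.1, seed=0):
    """Learn keep-probabilities for a single behavior sequence (illustrative only)."""
    rng = random.Random(seed)
    logits = [0.0] * len(behaviors)  # assumption: one free logit per position
    baseline = 0.0
    for _ in range(steps):
        mask, probs = sample_mask(logits, rng)
        reward = toy_reward(mask, behaviors)
        logits = reinforce_step(logits, mask, probs, reward, baseline, lr)
        # Moving-average baseline to reduce gradient variance.
        baseline = 0.9 * baseline + 0.1 * reward
    return [sigmoid(z) for z in logits]


keep_probs = train_sampler(["click", "purchase", "click", "purchase"])
```

In this sketch the sampled mask is never differentiated through; only the log-probability of the sampled actions is, which is how a policy gradient sidesteps the non-differentiable sampling step. In a full system the logits would presumably come from a learned sampler layer conditioned on the sequence, and the reward would come from the recommender itself.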