CDARL: a contrastive discriminator-augmented reinforcement learning framework for sequential recommendations

Zhuang Liu, Yunpu Ma, Marcel Hildebrandt, Yuanxin Ouyang, Zhang Xiong
DOI: https://doi.org/10.1007/s10115-022-01711-7
IF: 2.7
2022-07-17
Knowledge and Information Systems
Abstract: Sequential recommendations play a crucial role in many real-world applications. Because of their sequential nature, reinforcement learning has been employed to iteratively produce recommendations based on an observed stream of user behavior. In this setting, a recommendation agent interacts with the environment (users) by sequentially recommending items (actions) to maximize users' long-term cumulative rewards. However, most reinforcement learning-based recommendation models focus only on extrinsic rewards based on user feedback. This leads to sub-optimal policies when user-item interactions are sparse, and such models fail to capture dynamic rewards that reflect users' preferences. As a remedy, we propose a dynamic intrinsic reward signal integrated with a contrastive discriminator-augmented reinforcement learning framework. Concretely, our framework contains two modules: (1) a contrastive learning module that learns representations of item sequences; (2) an intrinsic reward learning function that imitates the user's internal dynamics. Furthermore, we combine the static extrinsic reward and the dynamic intrinsic reward to train a sequential recommender system based on double Q-learning. We integrate our framework with five representative sequential recommendation models. Specifically, our framework augments these recommendation models with two output layers: a supervised layer that applies cross-entropy loss to perform ranking, and a second layer for reinforcement learning. Experimental results on two real-world datasets demonstrate that the proposed framework outperforms several sequential recommendation baselines as well as baselines that explore with intrinsic rewards.
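To make the training signal described in the abstract more concrete, the following is a minimal sketch of a double Q-learning target that mixes an extrinsic and an intrinsic reward, together with a joint loss over a supervised head and a Q head. It is an illustration only, not the authors' implementation: the additive reward mix, the weight `beta`, the discount `gamma`, the unweighted sum of the two losses, and all function names are assumptions, since the abstract does not specify them.

```python
import math


def double_q_target(r_ext, r_int, q_online_next, q_target_next,
                    gamma=0.9, beta=0.5, done=False):
    """One-step double Q-learning target with a mixed reward.

    Assumption: the rewards are combined additively as r_ext + beta * r_int.
    """
    reward = r_ext + beta * r_int
    if done:
        return reward
    # Double Q-learning: the online network selects the next action,
    # and the target network evaluates that action.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def cross_entropy(logits, target_item):
    """Cross-entropy loss of the supervised head for one observed next item."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target_item]


def joint_loss(logits, q_values, action, target_item, td_target):
    """Supervised ranking loss plus squared TD error of the Q head.

    Assumption: the two terms are summed without extra weighting.
    """
    td_error = td_target - q_values[action]
    return cross_entropy(logits, target_item) + td_error ** 2
```

In this sketch, `logits` and `q_values` stand for the outputs of the two heads placed on top of a shared sequence encoder, and `r_int` stands for the output of the learned intrinsic reward function; how those quantities are produced is outside the scope of the example.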
computer science, information systems, artificial intelligence