Abstract: Recommendation systems have attracted considerable attention because they help users identify their interests by predicting their ratings of, or preferences for, specific items. The ability of an RL (Reinforcement Learning) agent to learn from environmental reward signals, without labelled training data, makes it a particularly suitable approach for such systems. For this reason, prior work has applied DRL (Deep RL) to recommendation. However, existing studies face several challenges: scalability issues, the likelihood of overlapping values and information loss when inputs are passed into a NN (Neural Network), and improper model training, all of which lead to incorrect recommendations. This study aims to resolve these pitfalls. To that end, it proposes a DRR (DRL-based Recommendation) framework built on actor-critic learning. In the actor network, DWL-FA (Deep Weighted Likelihood-Factor Analysis) is proposed to adapt the existing DNN (Deep Neural Network) to a new environment, compensating the vector by removing unwanted regions from the network output. The attention mechanism used in this process supplies the decoder with relevant information from each hidden state of the encoder. Together with the DWL-FA model, the attention mechanism can concentrate selectively on valuable input sequences and thereby learn the associations among them effectively, which helps the trained model learn better. In the critic network, HMP-WU (Hidden Markov Probability-Weight Updation) is proposed to optimize the interactions between the recommender system (agent) and the users together with their preferences for the recommended items (environment). Here, the weight-update process helps the model capture related sequences, thereby correcting incorrect predictions. With these components, the system achieves better results, with an increase of 5.74% in the average p-value.
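As an illustration of the attention step described in the abstract, where the decoder receives information from each hidden state of the encoder, the following is a minimal sketch of plain dot-product attention. It is not the DWL-FA model proposed in the paper; the function name `attention_context` and the toy vectors are hypothetical and chosen only for this example.

```python
import math

def attention_context(decoder_state, encoder_states):
    """Generic dot-product attention (an assumption, not the paper's DWL-FA).

    Scores each encoder hidden state against the decoder state, turns the
    scores into softmax weights, and returns the weighted sum of the
    encoder states together with the weights.
    """
    # Similarity of each encoder hidden state to the decoder state.
    scores = [sum(h_i * s_i for h_i, s_i in zip(h, decoder_state))
              for h in encoder_states]
    # Softmax over time steps, shifted by the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the encoder hidden states.
    dim = len(encoder_states[0])
    context = [sum(w * h[j] for w, h in zip(weights, encoder_states))
               for j in range(dim)]
    return context, weights

# Hypothetical toy input: four encoder hidden states of dimension three.
encoder_states = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
decoder_state = [1.0, 1.0, 0.0]
context, weights = attention_context(decoder_state, encoder_states)
```

In this toy input the fourth hidden state is the one most aligned with the decoder state, so it receives the largest weight; this is the sense in which attention concentrates on the more valuable parts of the input sequence.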

Q-ADER: An Effective Q-Learning for Recommendation With Diminishing Action Space

Pseudo Dyna-Q

Adaptive Learning Recommendation Strategy Based on Deep Q-learning

A Deep Reinforcement Learning Recommender System With Multiple Policies for Recommendations

Stabilizing Reinforcement Learning in Dynamic Environment with Application to Online Recommendation

Human-in-the-Loop Reinforcement Learning in Continuous-Action Space

Value Penalized Q-Learning for Recommender Systems

Deep Reinforcement Learning Framework for Category-Based Item Recommendation

Diversity-Aware Top-N Recommendation: A Deep Reinforcement Learning Way

Generative Adversarial User Model for Reinforcement Learning Based Recommendation System

Simplifying Deep Temporal Difference Learning

Handling Large-Scale Action Space In Deep Q Network

On Improving Deep Reinforcement Learning for POMDPs

A Q-learning Approach for Adherence-Aware Recommendations

Exploration and Regularization of the Latent Action Space in Recommendation

Time-Aware Q-Networks: Resolving Temporal Irregularity for Deep Reinforcement Learning

Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning

Intrinsically Motivated Reinforcement Learning Based Recommendation with Counterfactual Data Augmentation

DRL-HIFA: a dynamic recommendation system with deep reinforcement learning based Hidden Markov Weight Updation and factor analysis

QDAP: Downsizing adaptive policy for cooperative multi-agent reinforcement learning