Quantum Reinforcement Learning for Multi-Armed Bandits

Yi-Pei Liu, Kuo Li, Xi Cao, Qing-Shan Jia, Xu Wang
DOI: https://doi.org/10.23919/ccc55666.2022.9902595
2022-01-01
Abstract: This work focuses on the multi-armed bandit (MAB) problem and proposes a quantum reinforcement learning (RL) algorithm for action selection. Existing quantum RL algorithms generally assume that some prior information about the optimal action is known and set the initial action probabilities unequally. Our algorithm can be executed with an equal initial probability on each action and can greatly accelerate the learning process.