A Deep Reinforcement Learning Method Based on Attentional Memories

Libin Sun, Gao Biao, Haobin Shi
DOI: https://doi.org/10.1109/icceai55464.2022.00108
2022-01-01
Abstract: In continuous visual decision-making scenarios, the environment is often partially rather than fully observable, and traditional deep reinforcement learning methods often fail to learn the hidden information in such environments. This work proposes AMU-DQN, a continuous visual decision-making algorithm for partially observable environments that combines a deep Q-network with the proposed attentional memories unit (AMU), which integrates temporal and spatial features from historical sequences. Owing to the mechanism of the AMU recurrent layer, AMU-DQN can learn useful hidden information from the sequence of historical observations in a partially observable environment, giving it a form of attention and memory. Extensive simulation experiments show that AMU-DQN performs strongly in both convergence speed and peak converged reward.
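The abstract does not describe the internals of the AMU, so the following is only a rough illustration of the general idea it names: summarizing a history of observation features with attention and feeding the summary to a Q-value head. It uses plain scaled dot-product attention as a stand-in, not the authors' AMU, and every name and size in it (`attend`, `q_values`, `greedy_action`, `FEATURE_DIM`, `HISTORY_LEN`, `NUM_ACTIONS`, the random weights) is a hypothetical choice made for this sketch.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(history):
    """Summarize a history of feature vectors with dot-product attention.

    The most recent feature vector is the query; every vector in the
    history serves as both key and value. This is generic attention used
    as a stand-in: the abstract does not specify how the AMU works.
    """
    query = history[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in history]
    weights = softmax(scores)
    context = [
        sum(w * vec[i] for w, vec in zip(weights, history))
        for i in range(len(query))
    ]
    return context, weights


def q_values(context, q_weights):
    """Linear Q head (an assumption): one weight row per action."""
    return [sum(w * c for w, c in zip(row, context)) for row in q_weights]


def greedy_action(history, q_weights):
    """Pick the action with the highest Q-value for the attended history."""
    context, _ = attend(history)
    qs = q_values(context, q_weights)
    return max(range(len(qs)), key=qs.__getitem__)


# Hypothetical sizes and random, untrained weights, for illustration only.
rng = random.Random(0)
FEATURE_DIM, HISTORY_LEN, NUM_ACTIONS = 4, 5, 3
history = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
           for _ in range(HISTORY_LEN)]
q_weights = [[rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
             for _ in range(NUM_ACTIONS)]

context, weights = attend(history)
action = greedy_action(history, q_weights)
```

In a partially observable setting, the point of such a summary is that the Q-values depend on the whole window of past observations rather than on the latest frame alone. A trained system would learn the feature extractor, the attention parameters, and the Q head jointly, which this sketch does not attempt.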