Improving Deep Reinforcement Learning with Mirror Loss

Jian Zhao, Weide Shu, Youpeng Zhao, Wengang Zhou, Houqiang Li
DOI: https://doi.org/10.1109/tg.2022.3164470
IF: 1.237
2023-01-01
IEEE Transactions on Games
Abstract: Recent years have witnessed major breakthroughs of deep reinforcement learning (DRL) in various artificial intelligence applications, but training requires a very large number of samples and substantial computation. One feasible way to alleviate this low sample efficiency is to improve state representation learning. We find that agents trained with the original DRL algorithm suffer severe performance degradation in mirrored game environments. Because mirror symmetry is an important property of the environment, this poor performance in the mirrored setting indicates that the agents have not fully captured the essence of the environment. To address this problem and exploit the property for better state representation, we propose a mirror loss, an auxiliary module that brings mirror-symmetric representation to the DRL agent. It is model-agnostic and prompts the DRL agent to take logically consistent mirrored actions in the mirrored environment. We conduct experiments on OpenAI Gym Atari environments and on a more complex reinforcement learning task, Mahjong AI, and the results demonstrate the efficiency and versatility of our method.
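The abstract does not state the formula of the mirror loss, so the following is only a minimal sketch of one plausible form of such an auxiliary term, not the authors' implementation. It assumes image-like observations that can be mirrored by a horizontal flip, a discrete action set with a known left/right permutation, and a KL divergence as the consistency measure. The names `mirror_observation`, `mirror_consistency_loss`, `action_perm`, and both toy policies are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def mirror_observation(obs):
    # Assumption: mirroring the environment is a horizontal flip
    # of an (H, W) observation.
    return obs[:, ::-1]

def mirror_consistency_loss(policy_logits_fn, obs, action_perm, eps=1e-8):
    # Policy on the original observation and on its mirror image.
    p = softmax(policy_logits_fn(obs))
    q = softmax(policy_logits_fn(mirror_observation(obs)))
    # In the mirrored environment, action action_perm[a] should receive
    # the probability that action a received in the original one.
    target = np.empty_like(p)
    target[np.asarray(action_perm)] = p
    # KL(target || q): zero when the policy is mirror-consistent.
    return float(np.sum(target * (np.log(target + eps) - np.log(q + eps))))

# Hypothetical action set: 0 = NOOP, 1 = LEFT, 2 = RIGHT.
ACTION_PERM = [0, 2, 1]  # mirroring swaps LEFT and RIGHT

def symmetric_policy(obs):
    # Toy policy that treats the left and right halves symmetrically.
    w = obs.shape[1] // 2
    return np.array([0.0, obs[:, :w].sum(), obs[:, w:].sum()])

def lopsided_policy(obs):
    # Toy policy that only looks at the left half and only ever favors LEFT.
    w = obs.shape[1] // 2
    return np.array([0.0, obs[:, :w].sum(), 0.0])

obs = np.zeros((4, 4))
obs[:, 0] = 1.0  # an object near the left edge

loss_symmetric = mirror_consistency_loss(symmetric_policy, obs, ACTION_PERM)
loss_lopsided = mirror_consistency_loss(lopsided_policy, obs, ACTION_PERM)
```

In this sketch the symmetric toy policy yields a loss of essentially zero, while the lopsided one is penalized. In training, such a term would be added to the base DRL objective with some weighting coefficient, which is also an assumption here.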