Erlang Planning Network: an Iterative Model-Based Reinforcement Learning with Multi-Perspective

Jiao Wang, Lemin Zhang, Zhiqiang He, Can Zhu, Zihui Zhao
DOI: https://doi.org/10.1016/j.patcog.2022.108668
IF: 8
2022-01-01
Pattern Recognition
Abstract: For model-based reinforcement learning (MBRL), one of the key challenges is modeling error, which cripples the effectiveness of model planning and causes poor robustness during training. In this paper, we propose a bi-level Erlang Planning Network (EPN) architecture, which is composed of an upper-level agent and several multi-scale parallel sub-agents, trained in an iterative way. The proposed method focuses on expanding the representation of the environment: it maintains multiple perspectives over the world model, which offer varied ways to represent an agent's knowledge about the world, alleviate the problem of falling into local optima, and enhance robustness during model planning. Moreover, we evaluate EPN on a range of continuous-control tasks in MuJoCo. The results show that the proposed framework finds exemplar solutions faster and consistently reaches state-of-the-art performance. (c) 2022 Elsevier Ltd. All rights reserved.