Meta-Reinforcement Learning Algorithm Based on Reward and Dynamic Inference

Jinhao Chen, Chunhong Zhang, Zheng Hu
DOI: https://doi.org/10.1007/978-981-97-2259-4_17
2024-01-01
Abstract: Meta-Reinforcement Learning aims to rapidly address unseen tasks that share similar structures. However, the agent relies heavily on a large amount of experience during the meta-training phase, which makes high sample efficiency a formidable challenge. Current methods typically adapt to novel tasks within the Meta-Reinforcement Learning framework through task inference. Unfortunately, these approaches still exhibit limitations when faced with high-complexity task spaces. In this paper, we propose a Meta-Reinforcement Learning method based on reward and dynamic inference. We introduce independent reward and dynamic inference encoders, which sample specific context information to capture the deep-level features of task goals and dynamics. By reducing the task inference space, the agent effectively learns the shared structures across tasks and acquires a profound understanding of the task differences. We illustrate the performance degradation caused by high task inference complexity and demonstrate that our method outperforms previous algorithms in terms of sample efficiency.