$\epsilon$-Invariant Hierarchical Reinforcement Learning for Building Generalizable Policy

Yihan Li, Tianren Zhang, Jinsheng Ren, Feng Chen
2023-01-01
Abstract: Goal-conditioned Hierarchical Reinforcement Learning (HRL) has shown remarkable potential for solving complex control tasks. However, existing methods struggle in tasks that require generalization, since the learned subgoals are highly task-specific and therefore hardly reusable. In this paper, we propose a novel HRL framework called $\epsilon$-Invariant HRL that uses abstract, task-agnostic subgoals that are reusable across tasks, resulting in a more generalizable policy. Although such subgoals are reusable, a transition mismatch problem caused by the inevitably incorrect value evaluation of subgoals can lead to non-stationary learning and even collapse. We mitigate this mismatch problem by training the high-level policy to adapt to stochasticity that is manually injected into the low-level policy. As a result, our framework can leverage reusable subgoals to build a hierarchical policy that generalizes effectively to unseen tasks. Theoretical analysis and experimental results on continuous control navigation tasks and challenging zero-shot generalization tasks show that our approach significantly outperforms state-of-the-art methods.