Exploiting Hybrid Policy in Reinforcement Learning for Interpretable Temporal Logic Manipulation

Hao Zhang, Hao Wang, Xiucai Huang, Wenrui Chen, Zhen Kan
DOI: https://doi.org/10.1109/IROS58592.2024.10802202
2024-12-29
Abstract: Reinforcement learning (RL) based methods have been increasingly explored for robot learning. However, RL-based methods often suffer from low sampling efficiency in the exploration phase, especially for long-horizon manipulation tasks, and generally neglect the semantic information from the task level, resulting in delayed convergence or even task failure. To tackle these challenges, we propose a Temporal-Logic-guided Hybrid policy framework (HyTL) which leverages three-level decision layers to improve the agent's performance. Specifically, the task specifications are encoded via linear temporal logic (LTL) to improve performance and offer interpretability. A waypoint planning module, designed with feedback from the LTL-encoded task level, serves as a high-level policy to improve exploration efficiency. The middle-level policy selects which behavior primitives to execute, and the low-level policy specifies the corresponding parameters to interact with the environment. We evaluate HyTL on four challenging manipulation tasks, and the results demonstrate its effectiveness and interpretability. Our project is available at: https://sites.google.com/view/hytl-0257/
Robotics, Machine Learning