Towards Efficient Long-Horizon Decision-Making Using Automated Structure Search Method of Hierarchical Reinforcement Learning for Edge Artificial Intelligence

Guanlin Wu, Weidong Bao, Jiang Cao, Xiaomin Zhu, Ji Wang, Wenhua Xiao, Wenqian Liang
DOI: https://doi.org/10.1016/j.iot.2023.100951
IF: 5.711
2023-01-01
Internet of Things
Abstract: Hierarchical reinforcement learning (HRL) is a promising approach for efficiently solving long-horizon decision-making tasks in the Internet of Things (IoT) domain. However, HRL algorithms are known to rely on expert knowledge to preset an appropriate hierarchical structure for each IoT task, which raises trial costs and limits their wider application. In this paper, we propose a new method called DHRL (Dynamic-Level Hierarchical Reinforcement Learning), which adaptively searches for the optimal hierarchical structure while maintaining the generality of the framework design. DHRL incorporates an embedded exploration and exploitation mechanism that addresses the challenges caused by dependence between different levels and balances maximizing benefits against current evaluation accuracy. Nonetheless, the additional exploration inevitably has a negative impact on performance. To mitigate this impact, we propose a synchronous training architecture that lets DHRL operate in a distributed and parallel manner, and we introduce an adaptive evolutionary method to accelerate convergence. Extensive experimental evaluations demonstrate the effectiveness of our theory and method.