Hierarchical Meta-Reinforcement Learning for Resource-Efficient Slicing in O-RAN

Xianfu Chen, Celimuge Wu, Zhifeng Zhao, Yong Xiao, Shiwen Mao, Yusheng Ji
DOI: https://doi.org/10.1109/globecom54140.2023.10437350
2023-01-01
Abstract: Open radio access network (O-RAN) slicing allows flexible control of network components and resources to satisfy the ever-increasing demands of mobile applications. Optimizing service provisioning requires efficient management of limited radio resources, which is challenging because network slices must be orchestrated on the long timescale while each slice is configured according to mobile user (MU) statistics on the short timescale. In this paper, we first propose a novel meta Markov decision process framework to mathematically formulate the problem of two-timescale radio resource management (RRM) in O-RAN slicing. The original RRM problem is then decoupled into a long-timescale master problem and a short-timescale subproblem, which are solved by a hierarchical reinforcement learning (RL) mechanism. The proposed mechanism comprises a deep RL algorithm that solves for the optimal long-timescale RRM policy and a linear-decomposition-based meta-RL algorithm that solves for the optimal short-timescale RRM policy. Numerical experiments verify the theoretical analysis and show that the proposed hierarchical RL mechanism outperforms the most representative state-of-the-art baselines.
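The two-timescale split described in the abstract can be pictured with a toy control loop. The sketch below is illustrative only and is not the paper's algorithm: an epsilon-greedy bandit stands in for the deep RL algorithm on the long timescale, a fixed heuristic stands in for the meta-RL algorithm on the short timescale, and every name, demand range, and cost value (`short_timescale_step`, `run_two_timescale`, `unit_cost`, two slices, ten resource units) is a hypothetical assumption chosen for brevity.

```python
import random


def short_timescale_step(share, demand, unit_cost=0.2):
    """Placeholder short-timescale policy (assumption, not the paper's meta-RL).

    Activates just enough of the slice's resource units to meet the current
    demand and returns served traffic minus a per-unit resource cost.
    """
    active = min(share, demand)
    return active - unit_cost * active


def run_two_timescale(total_units=10, epochs=200, slots_per_epoch=5,
                      epsilon=0.1, seed=0):
    """Toy two-timescale loop for two slices.

    Long timescale: once per epoch, pick how many units slice 0 receives
    (slice 1 gets the rest) with an epsilon-greedy bandit, a simple stand-in
    for the deep RL orchestrator.
    Short timescale: in every slot of the epoch, each slice serves its
    randomly drawn demand from its fixed share.
    """
    rng = random.Random(seed)
    actions = list(range(total_units + 1))  # units assigned to slice 0
    value = [0.0] * len(actions)            # running mean reward per action
    count = [0] * len(actions)
    history = []

    for _ in range(epochs):
        # Long-timescale decision: explore with probability epsilon.
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda i: value[i])
        shares = (a, total_units - a)

        # Short-timescale decisions under the fixed shares.
        epoch_reward = 0.0
        for _ in range(slots_per_epoch):
            demands = (rng.randint(0, 8), rng.randint(0, 3))  # hypothetical MU demand
            for share, demand in zip(shares, demands):
                epoch_reward += short_timescale_step(share, demand)

        # The epoch's accumulated reward is the feedback for the long-timescale learner.
        count[a] += 1
        value[a] += (epoch_reward - value[a]) / count[a]
        history.append((a, epoch_reward))

    return value, count, history


if __name__ == "__main__":
    value, count, history = run_two_timescale()
    best = max(range(len(value)), key=lambda i: value[i])
    print("most valued split for slice 0:", best)
```

The point of the sketch is the nesting: the long-timescale learner only sees the reward accumulated over a whole epoch of short-timescale decisions, which mirrors the master problem and subproblem structure named in the abstract.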