Cache-Assisted Collaborative Task Offloading and Resource Allocation Strategy: A Metareinforcement Learning Approach

Shiyou Chen, Lanlan Rui, Zhipeng Gao, Wenjing Li, Xuesong Qiu
DOI: https://doi.org/10.1109/jiot.2022.3168885
IF: 10.6
2022-01-01
IEEE Internet of Things Journal
Abstract: Multiaccess edge computing (MEC) provides users with better Quality of Experience (QoE) by offloading tasks to the nearby edge. However, the emergence of new Internet of Things applications with multiple tasks and repeated requests brings redundant computation and transmission to the edge. Meanwhile, current offloading methods based on deep reinforcement learning (DRL) suffer from low sampling efficiency and slow convergence when trained in a changing environment. Therefore, improving the QoE of computation offloading services remains a major challenge. In this article, we devise a collaboration of computing and cache resources among multiple edge nodes, which can reduce redundant computation and transmission. Specifically, we formulate the cache-assisted computation offloading process as a QoE-aware utility maximization problem based on multidimensional indicators. Then, we propose a cache-assisted collaborative task offloading and resource allocation strategy to solve it. This strategy is decomposed into two subproblems. First, to determine and obtain the task cache state, we propose a collaborative task caching algorithm, which can improve the hit rate of tasks while balancing network overhead. Second, to acquire offloading and resource allocation decisions efficiently, we propose a metareinforcement learning-based cache-assisted computation offloading method (MCCOM), which can achieve rapid offloading decisions with only a few gradient updates and samples. The optimization problem is transformed into multiple Markov decision processes (MDPs). The improved learning process includes metapolicy learning that adapts across these MDPs and policy learning for a specific MDP based on the metapolicy. Simulation results show that our proposed method outperforms baselines in terms of QoE indicators while achieving rapid convergence and fast decisions.