When Deep Reinforcement Learning Meets Federated Learning: Intelligent Multitimescale Resource Management for Multiaccess Edge Computing in 5G Ultradense Network
Shuai Yu, Xu Chen, Zhi Zhou, Xiaowen Gong, Di Wu
DOI: https://doi.org/10.1109/jiot.2020.3026589
2021-01-01
Abstract: Recently, smart cities, healthcare systems, and smart vehicles have raised challenges for the capability and connectivity of state-of-the-art Internet-of-Things (IoT) devices, especially for devices in hotspot areas. Multiaccess edge computing (MEC) can enhance the capability of emerging resource-intensive IoT applications and has attracted much attention. However, because of time-varying network environments and the heterogeneous resources of network devices, it is hard to achieve stable, reliable, and real-time interactions between edge devices and their serving edge servers, especially in 5G ultradense network (UDN) scenarios. Ultradense edge computing (UDEC) has the potential to fill this gap, but current solutions still lack: 1) efficient utilization of multiple 5G resources (e.g., computation, communication, storage, and service resources); 2) low-overhead offloading decision-making and resource allocation strategies; and 3) privacy and security protection schemes. We therefore first propose an intelligent UDEC (I-UDEC) framework, which integrates blockchain and artificial intelligence (AI) into 5G UDEC networks. Then, to achieve real-time and low-overhead computation offloading decisions and resource allocation strategies, we design a novel two-timescale deep reinforcement learning (2Ts-DRL) approach consisting of a fast-timescale and a slow-timescale learning process. The primary objective is to minimize the total offloading delay and network resource usage by jointly optimizing computation offloading, resource allocation, and service caching placement. We also leverage federated learning (FL) to train the 2Ts-DRL model in a distributed manner, aiming to protect the edge devices' data privacy. Simulation results corroborate the effectiveness of both 2Ts-DRL and FL in the I-UDEC framework and show that the proposed algorithm can reduce task execution time by up to 31.87%.
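
The abstract does not say how the distributed FL training combines the models learned on individual edge devices. As a rough illustration only, the sketch below assumes a FedAvg-style aggregation step, in which a server averages client parameter vectors weighted by each client's local sample count. The function name `federated_average` and its arguments are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Sample-count-weighted average of client parameter vectors.

    Assumption: a FedAvg-style rule; the paper's actual aggregation
    scheme is not described in the abstract.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client, and at least one client")

    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter dimension")

    # Each coordinate of the global model is the weighted mean of the
    # corresponding coordinate across clients. Only parameters are shared;
    # the raw local data never leaves the device.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two hypothetical edge devices holding 1 and 3 local samples.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this sketch the device with more local samples pulls the global parameters toward its own values, which is the usual motivation for weighting by sample count rather than taking a plain mean.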