Deep Reinforcement Learning Based Computation Offloading in Heterogeneous MEC Assisted by Ground Vehicles and Unmanned Aerial Vehicles

Hang He, Tao Ren, Meng Cui, Dong Liu, Jianwei Niu
DOI: https://doi.org/10.1007/978-3-031-19211-1_40
2022-01-01
Abstract: Compared with traditional mobile edge computing (MEC), heterogeneous MEC (H-MEC), which is assisted by ground vehicles (GVs) and unmanned aerial vehicles (UAVs) simultaneously, is attracting growing attention from both academia and industry. By deploying base stations (along with edge servers) on GVs or UAVs, H-MEC is better suited to network environments whose access demand changes dynamically, e.g., sports matches, traffic management, and emergency rescue. However, it is non-trivial to perform real-time user association and resource allocation in large-scale H-MEC environments. Motivated by this, we propose a shared multi-agent proximal policy optimization (SMAPPO) algorithm based on the centralized training and distributed execution framework. Because jointly optimizing user association and resource allocation for H-MEC is NP-hard, we adopt the actor-critic-based online-policy gradient (PG) algorithm to obtain near-optimal solutions with low scheduling complexity. In addition, considering the low sampling efficiency of PG, we introduce proximal policy optimization, which uses importance sampling to increase training efficiency. Moreover, we leverage centralized training and distributed execution to improve training efficiency and reduce scheduling complexity: each mobile device (MD) makes decisions based only on its local observation and learns from other MDs' experience through a shared replay buffer. Extensive simulation results demonstrate that SMAPPO achieves better performance than traditional algorithms.
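The abstract does not spell out SMAPPO's loss function, so the following is only a minimal sketch of the standard PPO clipped surrogate objective, included to illustrate how importance sampling lets a policy reuse samples collected under an older policy. The function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are illustrative assumptions, not details taken from the paper.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of sampled actions.

    Hypothetical helper for illustration; not the paper's implementation.
    new_log_probs: log pi_new(a|o) for each sampled action
    old_log_probs: log pi_old(a|o) recorded when the sample was collected
    advantages:    advantage estimates for the same samples
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Importance-sampling ratio between the new and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Clip the ratio so a single update cannot move the policy too far.
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a shared-buffer setting such as the one the abstract describes, the `old_log_probs` would presumably come from whichever mobile device's policy generated each stored sample; that pairing is an assumption here.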