Joint Long-Term Processed Task and Communication Delay Optimization in UAV-Assisted MEC Systems Using DQN

Maryam Farajzadeh Dehkordi, Bijan Jabbari
2024-09-24
Abstract: Mobile Edge Computing (MEC) assisted by Unmanned Aerial Vehicle (UAV) has been widely investigated as a promising system for future Internet-of-Things (IoT) networks. In this context, delay-sensitive tasks of IoT devices may either be processed locally or offloaded for further processing to a UAV or to the cloud. This paper, by attributing task queues to each IoT device, the UAV, and the cloud, proposes a real-time resource allocation framework in a UAV-aided MEC system. Specifically, aimed at characterizing a long-term trade-off between the time-averaged aggregate processed data (PD) and the time-averaged aggregate communication delay (CD), a resource allocation optimization problem is formulated. This problem optimizes communication and computation resources as well as the UAV motion trajectory, while guaranteeing queue stability. To address this long-term time-averaged problem, a Lyapunov optimization framework is initially leveraged to obtain an equivalent short-term optimization problem. Subsequently, we reformulate the short-term problem in a Markov Decision Process (MDP) form, where a Deep Q Network (DQN) model is trained to optimize its variables. Extensive simulations demonstrate that the proposed resource allocation scheme improves the system performance by up to 36% compared to baseline models.
Signal Processing
What problem does this paper attempt to address?
This paper addresses how to optimize the long-term trade-off between the amount of processed task data and the communication delay in a UAV-assisted mobile edge computing (MEC) system. Tasks generated by Internet of Things (IoT) devices can be processed locally or offloaded to a UAV or to the cloud for further processing. The authors propose a real-time resource allocation framework that aims to maximize the long-term average amount of processed data (PD) while minimizing the long-term average communication delay (CD) and keeping the task queues stable.

### Main Problem Description

1. **Long-term Trade-off between Task Processing and Communication Delay**
   - In the UAV-assisted MEC system, the tasks of IoT devices can be processed locally, on the UAV, or in the cloud.
   - The authors introduce a new metric, processed data efficiency (PDE), defined as the ratio of the time-averaged processed data (PD) to the time-averaged communication delay (CD):
     \[ \text{PDE}=\lim_{N\rightarrow\infty}\frac{\frac{1}{N}\sum_{n = 0}^{N - 1}\sum_{k = 1}^{K}B_{\text{Tot},k}[n]}{\frac{1}{N}\sum_{n = 0}^{N - 1}\sum_{k = 1}^{K}t_{\text{Comm},k}[n]} \]
     where \(B_{\text{Tot},k}[n]\) is the total amount of data processed for the \(k\)-th IoT device in the \(n\)-th time interval, and \(t_{\text{Comm},k}[n]\) is the corresponding communication delay.
2. **Resource Allocation Optimization Problem**
   - The paper formulates resource allocation as a long-term time-averaged problem and uses the Lyapunov optimization framework to transform it into a short-term optimization problem.
   - The short-term problem is then reformulated as a Markov decision process (MDP), and its variables are optimized by training a deep Q-network (DQN) model.
3. **Efficient Decision-making in Dynamic Environments**
   - Traditional algorithms often struggle to make quick decisions in dynamic environments and have high computational complexity. For this reason, the paper adopts deep reinforcement learning (DRL), specifically DQN, to achieve more efficient real-time resource allocation.

### Solutions

- **Lyapunov Optimization Framework**: Converts the long-term time-averaged problem into a short-term one while maintaining queue stability.
- **Markov Decision Process (MDP)**: Models the short-term optimization problem as an MDP so that DQN can be used for real-time decision-making.
- **Deep Q-network (DQN)**: A trained DQN model optimizes the UAV trajectory and the allocation of communication and computation resources, thereby improving overall system performance.

### Experimental Results

- Simulations show that the proposed resource allocation scheme improves system performance by up to 36% compared with baseline models.

In summary, this paper addresses how to optimize the long-term trade-off between task processing and communication delay in a UAV-assisted MEC system while ensuring queue stability and efficiency.
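To make the PDE ratio defined above concrete, the following is a minimal Python sketch that estimates it over a finite horizon of N slots instead of taking the limit. The function names, the nested-list data layout, and the toy numbers are assumptions made for illustration and do not come from the paper. The `queue_update` helper shows the generic queue recursion commonly used with Lyapunov optimization; the paper's exact queue dynamics may differ.

```python
from typing import Sequence


def processed_data_efficiency(
    processed_bits: Sequence[Sequence[float]],
    comm_delay: Sequence[Sequence[float]],
) -> float:
    """Finite-horizon estimate of PDE over N slots and K devices.

    processed_bits[n][k] plays the role of B_Tot,k[n];
    comm_delay[n][k] plays the role of t_Comm,k[n].
    (Hypothetical layout, chosen for this sketch only.)
    """
    if len(processed_bits) != len(comm_delay) or not processed_bits:
        raise ValueError("both inputs need the same non-zero number of slots")
    n_slots = len(processed_bits)
    # Time-averaged aggregate processed data (numerator of PDE).
    avg_data = sum(sum(row) for row in processed_bits) / n_slots
    # Time-averaged aggregate communication delay (denominator of PDE).
    avg_delay = sum(sum(row) for row in comm_delay) / n_slots
    if avg_delay <= 0:
        raise ValueError("aggregate communication delay must be positive")
    return avg_data / avg_delay


def queue_update(backlog: float, served: float, arrived: float) -> float:
    """Generic queue recursion Q[n+1] = max(Q[n] - served, 0) + arrived.

    This is the textbook form, assumed here; it is not taken from the paper.
    """
    return max(backlog - served, 0.0) + arrived


# Toy data: N = 3 slots, K = 2 devices (made-up numbers).
processed = [[4.0, 2.0], [3.0, 3.0], [5.0, 1.0]]
delay = [[0.5, 0.5], [1.0, 0.5], [0.25, 0.25]]
pde = processed_data_efficiency(processed, delay)
print(f"finite-horizon PDE estimate: {pde}")
```

Because both averages share the same 1/N factor, the finite-horizon estimate reduces to total processed data divided by total communication delay; the explicit averages are kept only to mirror the definition.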