Abstract: Mobile-edge computing (MEC) has emerged as a promising computing paradigm in the 5G architecture. It empowers user equipments (UEs) with additional computation and energy resources by migrating workloads from UEs to nearby MEC servers. Although computation offloading and resource allocation in MEC have been studied under different optimization objectives, existing work mainly focuses on performance in quasistatic systems and seldom considers time-varying system conditions. In this article, we investigate the joint optimization of computation offloading and resource allocation in a dynamic multiuser MEC system. Our objective is to minimize the energy consumption of the entire MEC system while accounting for the delay constraint and the uncertain resource requirements of heterogeneous computation tasks. We formulate the problem as a mixed-integer nonlinear programming (MINLP) problem and propose a value iteration-based reinforcement learning (RL) method, named Q-learning, to determine the joint policy of computation offloading and resource allocation.
To avoid the curse of dimensionality, we further propose a double deep Q network (DDQN)-based method, which can efficiently approximate the value function of Q-learning. The simulation results demonstrate that the proposed methods significantly outperform the baseline methods in different scenarios, with the exception of the exhaustion method.
In particular, the proposed DDQN-based method performs very close to the exhaustion method and reduces energy consumption by an average of 20%, 35%, and 53% compared with offloading decision, the local first method, and the offloading first method, respectively, when the number of UEs is 5.
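
For readers unfamiliar with the two learning rules named in the abstract, the following is a minimal sketch of a textbook tabular Q-learning update and a textbook double-Q target. It is not the authors' implementation: the function names, the dictionary-based Q-table, the step size `alpha`, the discount `gamma`, and the use of negative energy cost as the reward are all assumptions made for illustration.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the updated value.

    q maps each state to a list of action values (hypothetical layout).
    reward is assumed to be the negative energy cost of the chosen action.
    """
    # Bootstrap from the best action value available in the next state.
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def double_q_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Compute a double-Q learning target for one transition.

    next_q_online / next_q_target are the next-state action values from the
    online and target estimators (plain lists here, networks in a DDQN).
    """
    if done:
        return reward
    # The online estimator selects the action; the target estimator scores it.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    q_table = {0: [0.0, 0.0], 1: [1.0, 2.0]}
    print(q_learning_update(q_table, 0, 1, reward=-1.0, next_state=1, alpha=0.5))
    print(double_q_target(-1.0, [0.2, 0.7], [0.5, 0.3]))
```

Separating action selection from action evaluation in the second function is what distinguishes the double-Q target from the plain `max` used in the first; in a DDQN the two lists would come from two neural networks rather than from a table.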
