Actor-Critic Deep Reinforcement Learning for Energy Minimization in UAV-Aided Networks

Yaxiong Yuan, Lei Lei, Thang X. Vu, Symeon Chatzinotas, Bjorn Ottersten
DOI: https://doi.org/10.1109/eucnc48522.2020.9200931
2020-01-01
Abstract: In this paper, we investigate a user-timeslot scheduling problem for downlink unmanned aerial vehicle (UAV)-aided networks, where the UAV serves as an aerial base station. We formulate an optimization problem that jointly determines user scheduling and hovering time to minimize the UAV's transmission and hovering energy. An offline algorithm based on the branch and bound method and the golden section search is proposed to solve the problem. However, the computational time of the offline algorithm grows exponentially. Therefore, we apply a deep reinforcement learning (DRL) method to design an online algorithm with less computational time. To this end, we first reformulate the original user scheduling problem as a Markov decision process (MDP). Then, an actor-critic-based RL algorithm is developed to determine the scheduling policy under the guidance of two deep neural networks. Numerical results show that the proposed online algorithm obtains a good tradeoff between performance gain and computational time.
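The abstract names the golden section search as the one-dimensional routine used for the hovering time, but gives no energy model. The sketch below shows how such a search could work for a single timeslot. The function `total_energy` and all its parameters (`hover_power`, `bits`, `bandwidth`, `gain_over_noise`) are illustrative assumptions, not the paper's formulation: it assumes a constant hovering power and a Shannon-rate transmit power needed to deliver a fixed number of bits within the hovering time.

```python
import math


def golden_section_search(f, lo, hi, tol=1e-8):
    """Return an approximate minimizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)  # left interior point
    d = a + inv_phi * (b - a)  # right interior point
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


def total_energy(t, hover_power=100.0, bits=1e6, bandwidth=1e6,
                 gain_over_noise=1e3):
    """Hypothetical energy model (an assumption, not taken from the paper).

    Hovering for t seconds costs hover_power * t. Delivering `bits` within
    t seconds needs rate bits / t, and the Shannon formula gives the
    transmit power required to reach that rate.
    """
    tx_power = (2.0 ** (bits / (bandwidth * t)) - 1.0) / gain_over_noise
    return (hover_power + tx_power) * t


# Hovering time that minimizes the assumed energy model on [0.05, 10] s.
t_star = golden_section_search(total_energy, 0.05, 10.0)
```

Under this assumed model a short hovering time forces a high transmit power while a long one wastes hovering energy, so the objective is convex in `t` and the search converges to the single interior minimum.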