Multi-Agent Reinforcement Learning Aided Resource Allocation with SARSA in UAV Networks

Nawaf Qasem Hamood Othman, Jinglei Li, Qinghai Yang
DOI: https://doi.org/10.1109/icspcc59353.2023.10400208
2023-01-01
Abstract: This article studies autonomous resource allocation in communication networks served by multiple Unmanned Aerial Vehicles (UAVs), with the aim of maximizing long-term rewards. To capture the dynamics and uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game. The objective is to maximize the expected reward, with each UAV acting as a learning agent and each resource allocation solution corresponding to an action taken by the UAVs within a Multi-Agent Reinforcement Learning (MARL) framework. We further introduce an agent-agnostic approach in which every agent runs its decision algorithm independently, while all agents share a common structure based on SARSA (State-Action-Reward-State-Action). Our simulations show that the proposed algorithm performs commendably, particularly when compared with scenarios that demand exhaustive information exchange among the UAVs.
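The shared SARSA structure mentioned in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the class name `SarsaAgent`, the epsilon-greedy policy, the hyperparameter values, and the string state labels are all assumptions made for illustration. The abstract does not specify the state space, action space, or reward function. The sketch only shows the standard on-policy update Q(s, a) <- Q(s, a) + alpha * (r + gamma * Q(s', a') - Q(s, a)), which each UAV would apply to its own table without exchanging information with the others.

```python
import random
from collections import defaultdict


class SarsaAgent:
    """Hypothetical tabular SARSA learner; one instance per UAV."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=None):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)
        # Q-table: state -> list of action values, initialised to zero.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state):
        """Epsilon-greedy action selection over the agent's own Q-table."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state, next_action):
        """On-policy SARSA update using the action actually chosen next."""
        target = reward + self.gamma * self.q[next_state][next_action]
        self.q[state][action] += self.alpha * (target - self.q[state][action])


# Each UAV holds its own agent with the same structure and a private table.
agents = [SarsaAgent(n_actions=4, seed=i) for i in range(3)]
```

Because `update` bootstraps from the action the agent actually selects in the next state, rather than from the greedy maximum, the sketch is SARSA and not Q-learning.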