Traffic-driven Spectrum and Power Allocation Via Scalable Multi-Agent Reinforcement Learning

Yiming Zhang, Dongning Guo
DOI: https://doi.org/10.1109/allerton63246.2024.10735287
2024-01-01
Abstract: This paper introduces a novel traffic-driven approach to radio resource allocation in cellular networks by leveraging a fully scalable multi-agent reinforcement learning (MARL) framework. The objective is to minimize the packet delay of links under stochastic arrivals, where access points (APs) make spectrum and power allocation decisions based on limited local information. We formulate the task as a distributed learning problem and implement a multi-agent proximal policy optimization (MAPPO) algorithm with recurrent neural networks and queueing dynamics to train flexible policies that map dynamic traffic and channel state information (CSI) to allocation decisions. The proposed MARL-based solution enables decentralized training and execution, ensuring scalability to large networks. Extensive simulations demonstrate that the proposed methods achieve packet delay performance comparable to that of genie-aided centralized algorithms while using only local information and reducing execution time. The trained policies also show scalability and robustness across various network sizes and traffic conditions.
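To make the queueing dynamics mentioned in the abstract concrete, the following is a minimal sketch of one time slot for a set of interfering links sharing a single sub-band. It is not the authors' implementation: the Shannon-rate service model, the single sub-band, and the names `link_rates` and `step_queues` are all assumptions introduced here for illustration, and the abstract does not specify the paper's actual traffic or channel model.

```python
import numpy as np

def link_rates(power, gain, noise=1.0, bandwidth=1.0):
    """Per-link Shannon rate on one shared sub-band (assumed service model).

    gain[i, j] is the channel gain from transmitter j to receiver i;
    power[j] is the transmit power chosen by transmitter j.
    """
    power = np.asarray(power, dtype=float)
    gain = np.asarray(gain, dtype=float)
    received = gain * power[None, :]               # power seen at receiver i from transmitter j
    signal = np.diag(received)                     # desired-link power
    interference = received.sum(axis=1) - signal   # everything else on the sub-band
    sinr = signal / (noise + interference)
    return bandwidth * np.log2(1.0 + sinr)

def step_queues(queues, arrivals, service):
    """Advance each link's backlog by one slot.

    A link serves at most what it has queued; new arrivals join afterwards.
    Returns the updated backlog and the amount actually served.
    """
    queues = np.asarray(queues, dtype=float)
    served = np.minimum(queues, service)
    return queues - served + np.asarray(arrivals, dtype=float), served

# Two non-interfering links (identity gain matrix), hypothetical numbers.
rates = link_rates(power=[1.0, 3.0], gain=np.eye(2))
backlog, served = step_queues(queues=[5.0, 1.0], arrivals=[2.0, 0.0], service=rates)
```

In a sketch like this, a learned policy would choose `power` (and, with several sub-bands, the spectrum assignment) from each AP's local backlog and CSI, and a delay-related reward could be derived from the resulting backlog.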