Multi-Agent DRL for Two-Timescale Bandwidth Allocation in Multi-Beam Satellite Networks

Liming Liang, Pengfei Duan, Gaofeng Cui, Weidong Wang
DOI: https://doi.org/10.1109/jiot.2024.3511672
IF: 10.6
2024-01-01
IEEE Internet of Things Journal
Abstract: Multi-beam technology, one of the key technologies for implementing high-throughput satellite (HTS) systems, can flexibly allocate beam resources to enhance communication quality and transmission rates. Because service demand varies over time, traditional resource allocation methods struggle to accommodate dynamic changes in user service requirements. Thus, heuristic algorithms such as the genetic algorithm (GA), as well as deep reinforcement learning (DRL), are commonly used in current research for communication resource allocation. However, existing strategies mainly focus on radio resource allocation on a single timescale, neglecting that beam-level and user-level resource demands vary in distinct ways and therefore call for scheduling at different time granularities. In this paper, we investigate bandwidth allocation strategies in multi-beam satellite (MBS) networks. Considering the high complexity that single-agent algorithms face when allocating bandwidth under dynamic user demands, a two-timescale hierarchical bandwidth allocation algorithm based on multi-agent DRL is proposed to minimize the total system delay and guarantee delay fairness for ground users. Simulation results demonstrate that the proposed algorithm requires less training time to converge and achieves lower communication delay and better system fairness than the single-agent DRL algorithm.
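The abstract does not give the formulation, so the following is only an illustrative sketch of what a two-timescale hierarchy can look like: bandwidth is split among beams once per frame (slow timescale) and among users within each beam every slot (fast timescale). Everything here is an assumption made for illustration and not the authors' method: the proportional-split policy stands in for the multi-agent DRL agents, delay is approximated as demand divided by allocated bandwidth, Jain's index is used as the fairness measure, and all function names and parameters are hypothetical.

```python
import random


def jain_index(values):
    """Jain's fairness index: 1.0 when all values are equal, 1/n in the worst case."""
    total = sum(values)
    squares = sum(v * v for v in values)
    return (total * total) / (len(values) * squares) if squares > 0 else 1.0


def proportional_split(budget, demands):
    """Placeholder policy (assumption): split a budget in proportion to demand.

    In a DRL-based scheme, a learned agent would replace this function.
    """
    total = sum(demands)
    if total == 0:
        return [budget / len(demands)] * len(demands)
    return [budget * d / total for d in demands]


def simulate(num_beams=3, users_per_beam=4, frames=2, slots_per_frame=5,
             total_bandwidth=100.0, seed=0):
    """Run a toy two-timescale allocation; return (total delay, delay fairness)."""
    rng = random.Random(seed)
    delays = []
    for _ in range(frames):
        # Slow timescale: draw average user demands and split bandwidth among beams.
        user_demands = [[rng.uniform(1.0, 10.0) for _ in range(users_per_beam)]
                        for _ in range(num_beams)]
        beam_bw = proportional_split(total_bandwidth,
                                     [sum(u) for u in user_demands])
        for _ in range(slots_per_frame):
            for b in range(num_beams):
                # Fast timescale: per-slot demand fluctuates around the frame average,
                # and each beam re-splits its fixed share among its users.
                slot_demands = [d * rng.uniform(0.5, 1.5) for d in user_demands[b]]
                user_bw = proportional_split(beam_bw[b], slot_demands)
                # Crude delay proxy (assumption): demand divided by allocated bandwidth.
                delays.extend(d / w for d, w in zip(slot_demands, user_bw))
    return sum(delays), jain_index(delays)


if __name__ == "__main__":
    total_delay, fairness = simulate()
    print(f"total delay proxy: {total_delay:.3f}, Jain fairness: {fairness:.3f}")
```

The point of the structure is that the beam-level decision is held fixed across all slots of a frame while the user-level decision is refreshed every slot, which is one way to match slowly varying beam demand and quickly varying user demand.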