Deep Reinforcement Learning for Dynamic Bandwidth Allocation in Multi-Beam Satellite Systems

Shijun Ma, Xin Hu, Xianglai Liao, Weidong Wang
DOI: https://doi.org/10.1109/icccs52626.2021.9449160
2021-01-01
Abstract: The future multi-beam satellite (MBS) network is an essential part of the air-space-ground integrated network, which is the blueprint for 6G. As the MBS network scales up, allocating scarce bandwidth resources efficiently and dynamically while ensuring the Quality of Service (QoS) of users has become a major challenge. In this paper, we design a dynamic bandwidth allocation framework using Proximal Policy Optimization (DBA-PPO) to meet time-varying traffic demand, maximize utilization, and guarantee the QoS of users in the MBS system. Experimental results show that the proposed bandwidth allocation algorithm flexibly achieves the desired effectiveness with low complexity and is more cost-effective for large-scale MBS communication scenarios.