Abstract: As very-high-throughput satellite (VHTS) systems expand, the uneven distribution of traffic demand across time and space has become too significant to ignore. Allocating scarce on-board resources efficiently and dynamically so that capacity matches demand is a major challenge. Advances in flexible payload technology make it possible to address this challenge. However, the unsynchronized adjustment of resources and the time-varying demands of the VHTS system increase the computational complexity. We therefore propose a double-timescale bandwidth and power allocation (DT-BPA) scheme to manage the available resources in the flexible payload architecture effectively. We use a multi-agent deep reinforcement learning (MADRL) algorithm that aims to meet the time-varying traffic demand of each beam and to improve resource utilization. Simulation results demonstrate that the proposed DT-BPA algorithm improves the degree of matching between capacity and demand and reduces the system's power consumption. Additionally, it can be trained offline and deployed online, providing a more cost-effective solution for the VHTS system.

Double-Timescale Multi-Agent Deep Reinforcement Learning for Flexible Payload in VHTS Systems