Multi-objective deep reinforcement learning based time-frequency resource allocation for multi-beam satellite communications

Yuanzhi He, Biao Sheng, Hao Yin, Di Yan, Yingchao Zhang
DOI: https://doi.org/10.23919/jcc.2022.01.007
2022-01-01
China Communications
Abstract: Resource allocation is an important problem that influences the service quality of multi-beam satellite communications. In such systems, the available frequency bandwidth is limited, user requirements vary rapidly, and both high service quality and joint allocation of multi-dimensional resources such as time and frequency are required. An urgent and difficult research problem is therefore how to use an efficient and fast resource allocation algorithm to obtain a higher comprehensive utilization rate of multi-dimensional resources, maximize the number of users and the system throughput, and meet the demand for rapid allocation that adapts to a dynamically changing number of users, all under limited resources. To solve this multi-dimensional resource allocation problem, this paper establishes a multi-objective optimization model whose joint optimization goal is to maximize the number of users and the system throughput, and proposes a multi-objective deep reinforcement learning based time-frequency two-dimensional resource allocation (MODRL-TF) algorithm that adapts to the dynamically changing number of users and meets the timeliness requirements. Simulation results show that the proposed algorithm provides a higher comprehensive utilization rate of multi-dimensional resources, achieves multi-objective joint optimization, and obtains better timeliness than traditional heuristic algorithms such as the genetic algorithm (GA) and the ant colony optimization algorithm (ACO).
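To make the two objectives named in the abstract concrete, the following is a minimal illustrative sketch, not the paper's MODRL-TF algorithm. It assumes a small time-frequency grid, rectangular user requests, a simple first-fit placement rule standing in for a learned policy, and an equal-weight scalarization of the two objectives. The names `first_fit`, `allocate`, `scalarized_reward` and all parameter values are hypothetical.

```python
import numpy as np

def first_fit(grid, n_slots, n_blocks):
    """Return (t, f) of the first free n_slots x n_blocks rectangle, or None."""
    T, F = grid.shape
    for t in range(T - n_slots + 1):
        for f in range(F - n_blocks + 1):
            if not grid[t:t + n_slots, f:f + n_blocks].any():
                return t, f
    return None

def allocate(requests, T=4, F=4):
    """Place requests (n_slots, n_blocks, rate) on a T x F grid in order.

    Assumption: first-fit placement is only a stand-in for the decision
    an RL agent would make; the paper's policy is not reproduced here.
    """
    grid = np.zeros((T, F), dtype=bool)  # True marks an occupied cell
    served, throughput = 0, 0.0
    for n_slots, n_blocks, rate in requests:
        pos = first_fit(grid, n_slots, n_blocks)
        if pos is None:
            continue  # request is blocked, user is not served
        t, f = pos
        grid[t:t + n_slots, f:f + n_blocks] = True
        served += 1
        throughput += rate
    utilization = float(grid.mean())  # fraction of occupied cells
    return served, throughput, utilization

def scalarized_reward(served, throughput, n_requests, total_rate,
                      w_users=0.5, w_tp=0.5):
    """Weighted sum of the two normalized objectives (assumed weights)."""
    return w_users * served / n_requests + w_tp * throughput / total_rate

# Hypothetical requests: (time slots, frequency blocks, rate)
requests = [(2, 2, 10.0), (2, 2, 10.0), (4, 2, 30.0), (1, 1, 1.0)]
served, throughput, utilization = allocate(requests)
reward = scalarized_reward(served, throughput, len(requests),
                           sum(r[2] for r in requests))
```

In this sketch the large third request is blocked because no free 4 x 2 rectangle remains, which illustrates the tension between serving many users and serving high-rate users that a multi-objective formulation has to balance.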
telecommunications