Coverage Optimization for Large-Scale Mobile Networks with Digital Twin and Multi-Agent Reinforcement Learning

Haoqiang Liu, Tong Li, Fenyu Jiang, Weikang Su, Zhaocheng Wang
DOI: https://doi.org/10.1109/twc.2024.3464639
IF: 10.4
2024-01-01
IEEE Transactions on Wireless Communications
Abstract: With the exponential growth of mobile users, ensuring high-quality network coverage has become paramount. Large-scale mobile networks consist of numerous base stations (BSs), each with adjustable parameters such as angles and beam widths. Automatically optimizing network coverage is difficult because of environmental factors and the interdependence of the adjustable parameters. Given the inherent uncertainty and unpredictability of large-scale wireless networks, traditional methods such as heuristics and meta-heuristics lack the adaptability and scalability required to cope with their dynamic environment. To address these challenges, we propose utilizing digital twin and reinforcement learning (RL) techniques within mobile networks characterized by multiple collaborating agents. We first introduce DT-SimNet, a digital twin-enabled mobile network simulator to facilitate optimization evaluation. DT-SimNet can efficiently simulate communication behaviors of network elements within a complex environment while revealing user mobility patterns. Moreover, to address challenges arising from multifaceted relationships among users, BSs, and the parameters across BSs, we introduce an innovative strategy named Optimized Multi-Agent Proximal Policy Optimization with Self-supervised Prediction (OMAPPO-SSP). Standard MAPPO suffers from limited applicability and inferior performance under the dynamic characteristics of 5G networks; in contrast, OMAPPO-SSP leverages network structure optimization and a self-supervised prediction mechanism, employing multi-agent reinforcement learning (MARL) principles to enhance efficiency. By harnessing collaborative neural networks, OMAPPO-SSP facilitates the explicit learning of behavioral interactions among all BSs, enabling effective decision-making in environments characterized by intricate spatial relationships, dynamic user behaviors, and diverse interactions.
Extensive experiments are conducted to validate the efficiency and effectiveness of OMAPPO-SSP. Within the target area, OMAPPO-SSP achieves a coverage ratio of 94.66% and an average throughput of 89746 bits per second (bps), demonstrating significant improvements compared to competing methods.