A Scalable Mean-Field MARL Framework for Multi-Objective V2X Resource Allocation
Xuan Zhang, Hengxi Zhang, Huaze Tang, Le Liang, Ling Cheng, Xinlei Chen, Wenbo Ding, Xiao-Ping Zhang
DOI: https://doi.org/10.1109/tiv.2024.3422506
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Resource allocation in dense vehicle-to-everything (V2X) communication networks is challenging because of scalability issues, dynamic environments, and diverse quality-of-service requirements. Traditional optimization methods such as genetic algorithms struggle with computational complexity and dynamic planning. Meanwhile, pure multi-agent reinforcement learning (MARL), a type of machine learning in which multiple agents learn to optimize their decisions through interactions with each other and the environment, faces the curse of dimensionality and communication overhead, which limits its large-scale deployment in dynamic vehicular networks. This paper proposes an approach to joint spectrum allocation and power control in dense V2X networks, formulating it as a MARL-based multi-objective optimization problem. To reduce computational demands, we combine the centralized-training-with-decentralized-execution paradigm with parameter sharing within the classic Mean-Field MARL (MF-MARL) framework and name the result Scalable-V2X-MF-MARL (ScalV-MF-MARL), which enables lightweight training in dense V2X networks with numerous vehicle-to-vehicle (V2V) agents. It also encodes observations and the mean field of actions, enhancing the model's dynamism and scalability and allowing a single model to be applied across varying vehicle densities. Experimental results show that ScalV-MF-MARL achieves 99.5% of the performance of density-specific MF-MARL models while reducing GPU memory usage during training by 79.49% to 94.03% as the number of vehicles increases from 40 to 160. It also outperforms conventional algorithms. Its generalization across V2X network densities allows training in less dense scenarios, with seamless application to denser networks. In conclusion, ScalV-MF-MARL streamlines V2X network deployment and effectively handles dynamic changes in vehicle numbers.
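The abstract does not give the encoding itself, so the following is only a minimal sketch of the general mean-field idea it relies on: each agent summarizes its neighbors' discrete actions as an average of one-hot vectors, so the policy input has a fixed size regardless of how many vehicles are present. The function names (`mean_field_action`, `agent_input`), the observation size, and the four-action space are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def mean_field_action(actions, num_actions, neighbors):
    """Average the one-hot encoded actions of the listed neighbors.

    Hypothetical helper: `actions` holds one discrete action index per
    agent, `neighbors` lists the agent indices to average over.
    """
    neighbors = np.asarray(neighbors, dtype=int)
    if neighbors.size == 0:
        # No neighbors: fall back to an all-zero mean field (an assumption).
        return np.zeros(num_actions)
    one_hot = np.eye(num_actions)[actions[neighbors]]  # shape (k, num_actions)
    return one_hot.mean(axis=0)                        # shape (num_actions,)


def agent_input(observation, actions, num_actions, neighbors):
    """Concatenate a local observation with the neighbors' mean action."""
    mean_action = mean_field_action(actions, num_actions, neighbors)
    return np.concatenate([observation, mean_action])


# Illustrative usage: the input length depends only on the observation
# size and the action count, not on the number of agents.
rng = np.random.default_rng(0)
obs = np.zeros(5)  # assumed 5-dimensional local observation
for num_agents in (40, 160):
    joint_actions = rng.integers(0, 4, size=num_agents)
    x = agent_input(obs, joint_actions, 4, range(1, num_agents))
    print(num_agents, x.shape)
```

Because the input shape stays the same at 40 and 160 agents, one shared set of network parameters could in principle serve every agent at any density, which is the property the abstract attributes to parameter sharing combined with mean-field encoding.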