DPU-Enhanced Multi-Agent Actor-Critic Algorithm for Cross-Domain Resource Scheduling in Computing Power Network

Shuaichao Wang, Shaoyong Guo, Jiakai Hao, Yinlin Ren, Feng Qi
DOI: https://doi.org/10.1109/tnsm.2024.3434997
2024-01-01
IEEE Transactions on Network and Service Management
Abstract: Computing resources in the Computing Power Network (CPN) are unevenly distributed, which leads to an imbalance between resource supply and demand within domains and makes cross-domain resource scheduling necessary. To address the cross-domain resource scheduling challenge in CPN, this paper presents an Improved Multi-Agent Actor-Critic (IMAAC) resource scheduling approach that leverages Data Processing Unit (DPU) offloading. First, we introduce a cross-domain resource scheduling architecture tailored for CPN and based on DPU offloading. Specifically, we delegate certain functionalities of the Multi-Agent Deep Reinforcement Learning (MADRL) agent to DPUs, aiming to reduce the communication costs incurred while generating cross-domain scheduling decisions. Second, we introduce a parallel experience ensemble and a multi-head attention mechanism into the Multi-Agent Actor-Critic (MAAC) framework to compress the state-space dimensionality of agent association across domains. Finally, we introduce a parallelized dual-policy network structure to mitigate training instability and convergence difficulties in the actor and critic networks. Experimental results show that IMAAC reduces total system delay, energy consumption, and the number of discarded tasks by 5.98% to 13.56%, 23.54% to 33.55%, and 41.17% to 58.88%, respectively, compared with the benchmark experiments.