Fast Policy Convergence for Traffic Engineering with Proactive Distributed Message-Passing
Zicheng Wang, Zirui Zhuang, Jingyu Wang, Qi Qi, Haifeng Sun, Jianxin Liao
DOI: https://doi.org/10.1109/ipdps57955.2024.00074
2024-01-01
Abstract: The growing variety of network applications is making network traffic increasingly complex, which places more stringent requirements on traffic engineering (TE). Although state-of-the-art TE approaches based on deep reinforcement learning (DRL) or traditional methods can generate optimal solutions for fixed traffic matrices, they cannot converge fast enough to provide real-time optimization in real networks, because of either excessive computation time or high communication overhead. Moreover, because the traffic load on the network changes dynamically, it is challenging to optimize maximum link utilization (MLU) and end-to-end delay at the same time: the two objectives may conflict, especially when the network is under low traffic load, which makes the modeling very difficult. To meet these challenges, we present RT-TE, a TE system based on DRL and distributed message-passing between intelligent agents that achieves real-time optimization of both MLU and end-to-end delay. To reduce the communication time caused by link propagation delay during optimization, we design a proactive message-passing mechanism that allows agents to compute the routing policy from partial messages while maintaining optimization performance. To balance the two optimization objectives, we incorporate the propagation delay into the DRL model and design a multi-objective training framework with parameter transfer. Based on theoretical modeling, we can find the best tradeoff between the two objectives. Finally, to improve the model's generalization across various traffic flows, we use a GNN model to generate the rewards of the DRL model, which greatly speeds up the training phase and allows us to feed massive amounts of data into the model.
In evaluations on real-world network topologies, our approach shows a 10%-20% improvement in MLU under short traffic-changing intervals and a 9%-13% improvement in end-to-end delay compared to state-of-the-art approaches.
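The conflict between MLU and end-to-end delay under low load can be illustrated with a toy example. The sketch below is not the RT-TE model; it assumes a hypothetical two-path network (a short path and a long path between the same pair of nodes), with capacities, propagation delays, and the function name `evaluate_split` all invented for illustration. MLU is computed as the largest ratio of link load to link capacity, and delay as the traffic-weighted mean propagation delay.

```python
# Toy illustration only: a hypothetical two-path network, not the RT-TE system.
# All numbers (capacities, delays, demand) are assumed for the example.

def evaluate_split(x, demand=4.0):
    """Return (MLU, mean propagation delay in ms) when a fraction x of the
    demand is routed on the short path and 1 - x on the long path."""
    capacities = {"short": 10.0, "long": 10.0}      # assumed link capacities
    prop_delay_ms = {"short": 1.0, "long": 5.0}     # assumed propagation delays
    loads = {"short": demand * x, "long": demand * (1.0 - x)}

    # Maximum link utilization: the largest load/capacity ratio over all links.
    mlu = max(loads[link] / capacities[link] for link in loads)

    # Traffic-weighted mean propagation delay over the two paths.
    mean_delay = x * prop_delay_ms["short"] + (1.0 - x) * prop_delay_ms["long"]
    return mlu, mean_delay


if __name__ == "__main__":
    for x in (1.0, 0.75, 0.5):
        mlu, delay = evaluate_split(x)
        print(f"split={x:.2f}  MLU={mlu:.3f}  mean delay={delay:.2f} ms")
```

With the assumed numbers, sending all traffic on the short path minimizes delay but doubles the MLU relative to an even split, while the even split minimizes MLU at the cost of a higher mean delay. Since no link is near capacity in either case, this is the low-load regime in which the two objectives pull in opposite directions.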