Multi-objective Optimization Service Function Chain Placement Algorithm Based on Reinforcement Learning
Hongtai Liu, Shengduo Ding, Shunyi Wang, Gang Zhao, Chao Wang
DOI: https://doi.org/10.1007/s10922-022-09673-5
2022-07-17
Journal of Network and Systems Management
Abstract: Network function virtualization (NFV) decouples network functions from dedicated hardware by running them as virtual network functions (VNFs), so that they can be realized in a more flexible, programmable manner, thereby reducing the pressure of resource allocation on the underlying network. A service function chain (SFC) is composed of a set of VNFs in a fixed order. These VNFs must be deployed on appropriate physical nodes to meet users' functional requirements; this is the SFC placement problem. Traditional solutions mostly rely on mathematical models or heuristic methods, which are not applicable to large-scale networks. In addition, existing methods do not integrate intelligent learning algorithms into the service function chain placement (SFCP) problem, which limits the possibility of obtaining better solutions. This paper presents a multi-objective optimization service function chain placement (MOO-SFCP) algorithm based on reinforcement learning (RL). The goal of the algorithm is to optimize resource allocation with respect to several performance indexes, including the revenue from underlying resource consumption, the revenue-to-cost ratio, the VNF acceptance rate, and network latency. We model SFCP as a Markov decision process (MDP) and use a two-layer policy network as the intelligent agent. In the training stage of RL, the agent jointly considers the optimization objectives and formulates the optimal physical node mapping strategy for VNF requests. In the test stage, the whole SFCP is completed according to the trained node mapping strategy. Simulation results show that the proposed algorithm performs well in terms of underlying resource allocation revenue, VNF acceptance rate, and other metrics. In addition, by varying the delay constraint, we show that the algorithm is flexible.
computer science, information systems, telecommunications
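
The MDP formulation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class `TwoLayerPolicy`, the function `place_sfc`, the three per-node features, the hop-based cost term, and the 0.5 latency weight are all hypothetical choices made for illustration, and the RL training update is omitted. The sketch only shows one placement episode: the state is the remaining node capacity, the action is the physical node chosen for the next VNF in the chain, and the reward combines a revenue-to-cost ratio with a latency penalty under a delay constraint.

```python
import math
import random


class TwoLayerPolicy:
    """Hypothetical two-layer policy: node features -> ReLU hidden layer -> one score per node."""

    def __init__(self, n_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

    def score(self, features):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, features)))
                  for row in self.w1]
        return sum(w * h for w, h in zip(self.w2, hidden))

    def action_probs(self, node_features, feasible):
        """Softmax over feasible nodes only; infeasible nodes get probability 0."""
        scores = [self.score(f) for f in node_features]
        usable = [s for s, ok in zip(scores, feasible) if ok]
        if not usable:
            return None  # no node can host this VNF
        top = max(usable)
        exps = [math.exp(s - top) if ok else 0.0
                for s, ok in zip(scores, feasible)]
        total = sum(exps)
        return [e / total for e in exps]


def place_sfc(policy, vnf_cpu, node_cpu, node_delay, max_delay, rng):
    """Run one placement episode; return (placement, reward) or (None, 0.0) if rejected."""
    remaining = list(node_cpu)
    placement, total_delay = [], 0.0
    for demand in vnf_cpu:  # VNFs are placed in chain order
        # Assumed state features: free capacity share, relative delay, relative demand.
        features = [[remaining[i] / node_cpu[i],
                     node_delay[i] / max_delay,
                     demand / node_cpu[i]] for i in range(len(node_cpu))]
        feasible = [r >= demand for r in remaining]
        probs = policy.action_probs(features, feasible)
        if probs is None:
            return None, 0.0  # request rejected: insufficient capacity
        node = rng.choices(range(len(probs)), weights=probs)[0]
        remaining[node] -= demand
        total_delay += node_delay[node]
        placement.append(node)
    if total_delay > max_delay:
        return None, 0.0  # request rejected: delay constraint violated
    # Assumed reward: revenue-to-cost ratio minus a weighted latency penalty.
    revenue = float(sum(vnf_cpu))
    hops = sum(1 for a, b in zip(placement, placement[1:]) if a != b)
    cost = revenue + hops  # assumed cost: each inter-node hop adds one unit
    reward = revenue / cost - 0.5 * total_delay / max_delay
    return placement, reward


if __name__ == "__main__":
    policy = TwoLayerPolicy(n_features=3, seed=1)
    rng = random.Random(0)
    print(place_sfc(policy, [2, 3, 1], [4, 4, 4], [1.0, 2.0, 1.5], 10.0, rng))
```

In a training loop, the returned reward would drive a policy-gradient update of `w1` and `w2`; tightening or relaxing `max_delay` changes which placements are accepted, which is the kind of delay-constraint variation the abstract refers to.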