Automatic Selection of Security Service Function Chaining Using Reinforcement Learning

Guanglei Li, Huachun Zhou, Bohao Feng, Guanwen Li, Shui Yu
DOI: https://doi.org/10.1109/GLOCOMW.2018.8644122
2018-01-01
Abstract: When selecting a security Service Function Chaining (SFC) for network defense, operators usually take security performance, service quality, deployment cost, and network function diversity into consideration, which makes the selection a multi-objective optimization problem. However, as applications, users, and data volumes grow massively in networks, traditional mathematical approaches cannot be applied to online security SFC selection because of their high execution time and the uncertainty of network conditions. Thus, in this paper, we use reinforcement learning, specifically the Q-learning algorithm, to automatically choose a suitable security SFC for various requirements. In particular, we design a reward function that makes a tradeoff among the different objectives, and we modify the standard ε-greedy exploration to select multiple ranked actions for diversified network defense. We compare Q-learning with mathematical optimization-based approaches, which are assumed to know network state changes in advance. The training and testing results show that the Q-learning based approach can capture changes in network conditions and make a tradeoff among the different objectives.
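The abstract does not give the reward formula, the state and action definitions, or the exact exploration rule, so the following is only a minimal sketch of the general idea under stated assumptions: a weighted-sum reward over security, quality, and cost, tabular Q-learning, and an ε-greedy variant that returns the top k actions by Q-value instead of a single one. The weights, the three candidate chains and their metrics, the single "normal" state, and every function name are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical candidate chains: (security, service quality, cost), each in [0, 1].
CHAINS = {
    "fw-ids": (0.6, 0.9, 0.2),
    "fw-ids-dpi": (0.9, 0.5, 0.8),
    "fw": (0.3, 0.9, 0.1),
}

def reward(security, quality, cost, weights=(0.5, 0.3, 0.2)):
    """Assumed weighted-sum tradeoff: reward security and quality, penalize cost."""
    w_sec, w_qos, w_cost = weights
    return w_sec * security + w_qos * quality - w_cost * cost

def select_top_k(q_row, actions, k, epsilon, rng):
    """Return k distinct actions: random with probability epsilon,
    otherwise the k actions with the highest Q-values, best first."""
    if rng.random() < epsilon:
        return rng.sample(actions, k)
    return sorted(actions, key=lambda a: q_row[a], reverse=True)[:k]

def q_update(q, state, action, r, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[next_state][a] for a in actions)
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])

def train(chains, episodes=2000, k=2, epsilon=0.2, seed=0):
    """Train on a toy single-state environment and return the ranked top-k chains."""
    rng = random.Random(seed)
    actions = list(chains)
    q = defaultdict(lambda: defaultdict(float))
    state = "normal"  # assumed single network state, for brevity
    for _ in range(episodes):
        for action in select_top_k(q[state], actions, k, epsilon, rng):
            q_update(q, state, action, reward(*chains[action]), state, actions)
    # epsilon = 0.0 gives the purely greedy ranking after training
    return select_top_k(q[state], actions, k, 0.0, rng)

if __name__ == "__main__":
    print(train(CHAINS))
```

Returning several ranked chains rather than one is what lets an operator deploy diversified defenses; in a realistic setting the state would encode observed network conditions, and the reward terms would come from measurements rather than fixed constants.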