A Reinforcement Learning-Based Network Traffic Prediction Mechanism in Intelligent Internet of Things

Laisen Nie, Zhaolong Ning, Mohammad S. Obaidat, Balqies Sadoun, Huizhi Wang, Shengtao Li, Lei Guo, Guoyin Wang
DOI: https://doi.org/10.1109/tii.2020.3004232
IF: 12.3
2021-03-01
IEEE Transactions on Industrial Informatics
Abstract: Intelligent Internet of Things (IIoT) is comprised of various wireless and wired networks for industrial applications, which makes it complex and heterogeneous. The openness of IIoT has led to the intractable problems of network security and management. Many network security and management functions rely on network traffic prediction techniques, such as anomaly detection and predictive network planning. Predicting IIoT network traffic is significantly difficult because its frequently updated topology and diversified services lead to irregular network traffic fluctuations. Motivated by these observations, we proposed a reinforcement learning-based mechanism in this article. We modeled the network traffic prediction problem as a Markov decision process, and then predicted network traffic by Monte Carlo $Q$-learning. Furthermore, we addressed the real-time requirement of the proposed mechanism, and we proposed a residual-based dictionary learning algorithm to reduce the complexity of Monte Carlo $Q$-learning. Finally, the effectiveness of our mechanism was evaluated using real network traffic.
automation & control systems; computer science, interdisciplinary applications; engineering, industrial
What problem does this paper attempt to address?
The paper addresses the difficulty of predicting network traffic in the Intelligent Internet of Things (IIoT). Specifically, it focuses on effective end-to-end network traffic prediction for industrial applications, given the complex and heterogeneous nature of 5G communication networks. Because IIoT networks are open and their topology is updated frequently, network traffic fluctuates irregularly, which makes prediction harder.

### Main problems

1. **Challenges in network traffic prediction**:
   - The complexity and heterogeneity of IIoT networks lead to irregular fluctuations in network traffic.
   - Traditional statistical models and machine-learning methods handle IIoT network traffic poorly because they struggle with the computational complexity introduced by densely deployed infrastructure and large-scale training data sets.
2. **Requirement for real-time prediction**:
   - Existing prediction mechanisms struggle to meet real-time requirements while maintaining accuracy.
   - Collecting and processing large volumes of network traffic data consumes substantial computing and memory resources, especially on resource-constrained edge nodes.

### Solutions

To solve these problems, the paper proposes a network traffic prediction mechanism based on reinforcement learning (RL). Its specific contributions are as follows:

1. **Network traffic modeling based on a Markov decision process (MDP)**:
   - Model the network traffic prediction problem as an MDP and use the Monte Carlo Q-learning algorithm for prediction.
   - Measure the difference between the predicted distribution and the actual distribution with the Kullback-Leibler (KL) divergence, and optimize the prediction model by minimizing it.
2. **Improved dictionary learning algorithm**:
   - Propose a residual-based adaptive dictionary learning algorithm to reduce computational complexity.
   - Project the network traffic into another space so that it can be represented by fewer coefficients, which reduces the number of actions and the computational complexity.
3. **Experimental verification**:
   - Evaluate the proposed mechanism using a real network traffic data set. The results show that the Monte Carlo Q-learning-based mechanism can better capture the short-term characteristics of IIoT network traffic.

### Summary of mathematical formulas

- **Q-learning update formula**:
  \[ Q(s, a) \leftarrow (1 - \alpha) Q(s, a) + \alpha \left( R(s, a) + \gamma \max_{\tilde{a} \in A} Q(\tilde{s}, \tilde{a}) \right) \]
  where $\alpha$ is the learning rate, $R(s, a)$ is the immediate reward, $\gamma$ is the discount factor, $\tilde{s}$ is the next state, and $A$ is the set of actions.
- **Optimization objective function**:
  \[ \min \text{KL}\left( p(X_t \mid X_{t-1}) \,\|\, p'(X'_t \mid X'_{t-1}) \right) \]
  where $\text{KL}$ denotes the KL divergence, and $p$ and $p'$ are the distributions of the predicted traffic and the actual traffic, respectively.

With these methods, the paper aims to provide an efficient, real-time, and accurate IIoT network traffic prediction mechanism to support complex industrial application scenarios.
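
The two formulas summarized above can be sketched in a few lines of Python. This is a minimal illustration of a plain tabular Q-learning update and a discrete KL divergence, not the paper's Monte Carlo Q-learning mechanism. The function names, the dictionary-based Q-table, the integer state and action encoding (for example, discretized traffic levels), and the default values of `alpha` and `gamma` are all assumptions made for this example.

```python
import math


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(s, a).

    `q` is a hypothetical Q-table: a dict mapping (state, action) to a value,
    with missing entries treated as 0.0.
    """
    # max over a~ in A of Q(s~, a~), evaluated at the next state
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Q(s, a) <- (1 - alpha) * Q(s, a) + alpha * (R(s, a) + gamma * best_next)
    q[(state, action)] = (1 - alpha) * old + alpha * (reward + gamma * best_next)
    return q[(state, action)]


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    Terms with p_i == 0 contribute nothing; q_i is floored at `eps`
    (an assumption of this sketch) to avoid division by zero.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


# Example usage with made-up states, actions, and reward.
q_table = {}
q_update(q_table, state=0, action=1, reward=1.0, next_state=1, actions=[0, 1])
```

In a prediction setting, a reward could be derived from how close the predicted distribution is to the observed one, for example the negative of `kl_divergence`. That pairing is an assumption here; the summary does not state how the paper defines its reward.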