Overcoming Data Limitations in Internet Traffic Forecasting: LSTM Models with Transfer Learning and Wavelet Augmentation

Sajal Saha, Anwar Haque, Greg Sidebottom

2024-09-20

Abstract: Effective internet traffic prediction in smaller ISP networks is challenged by limited data availability. This paper explores this issue using transfer learning and data augmentation techniques with two LSTM-based models, LSTMSeq2Seq and LSTMSeq2SeqAtn, initially trained on a comprehensive dataset provided by Juniper Networks and subsequently applied to smaller datasets. The datasets represent real internet traffic telemetry, offering insights into diverse traffic patterns across different network domains. Our study revealed that while both models performed well in single-step predictions, multi-step forecasts were challenging, particularly in terms of long-term accuracy. In smaller datasets, LSTMSeq2Seq generally outperformed LSTMSeq2SeqAtn, indicating that higher model complexity does not necessarily translate to better performance. The models' effectiveness varied across different network domains, reflecting the influence of distinct traffic characteristics. To address data scarcity, Discrete Wavelet Transform was used for data augmentation, leading to significant improvements in model performance, especially in shorter-term forecasts. Our analysis showed that data augmentation is crucial in scenarios with limited data. Additionally, the study included an analysis of the models' variability and consistency, with attention mechanisms in LSTMSeq2SeqAtn providing better short-term forecasting consistency but greater variability in longer forecasts. The results highlight the benefits and limitations of different modeling approaches in traffic prediction. Overall, this research underscores the importance of transfer learning and data augmentation in enhancing the accuracy of traffic prediction models, particularly in smaller ISP networks with limited data availability.

Machine Learning

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper aims to improve the accuracy of internet traffic forecasting in small Internet Service Provider (ISP) networks, where limited data is the main obstacle. It approaches the problem from four angles:

1. **Data Scarcity Challenge**:
   - Real-world internet traffic data is diverse and complex, which makes it difficult to train accurate predictive models on small datasets. Traditional statistical methods (such as ARIMA and SARIMA) struggle to handle nonlinear features, while machine learning and deep learning methods perform better but require large amounts of historical data.
2. **Application of Transfer Learning**:
   - Knowledge from models trained on large-scale datasets is transferred to small datasets to improve prediction performance. This approach is particularly suitable for large ISPs managing diverse networks, as obtaining large datasets is often impractical.
3. **Data Augmentation Techniques**:
   - Discrete Wavelet Transform (DWT) is used for data augmentation to expand the size of the target domain dataset. This helps alleviate the shortcomings of small datasets and enhances the model's generalization ability.
4. **Multi-Target Domain Prediction**:
   - Personalized predictive models are developed for different network segments (such as residential, commercial, or educational areas). This approach simplifies the model development process and reduces the time and resources required to build and deploy independent models.

Through these methods, the paper explores the effectiveness of combining transfer learning with data augmentation and proposes a systematic framework for determining the minimum target-domain dataset size that can benefit from transfer learning. This provides new insights into time series forecasting in small ISP networks.
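To make the wavelet augmentation idea (point 3) more concrete, here is a minimal sketch. The summary does not say which wavelet, decomposition level, or perturbation scheme the paper uses, so everything below is an assumption: a single-level Haar transform written in plain Python, with Gaussian noise added to the detail coefficients before reconstruction. The names `haar_dwt`, `haar_idwt`, `augment`, `n_copies`, and `noise_scale`, and the sample `traffic` values, are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar DWT: returns (approximation, detail) coefficient lists."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the signal from both coefficient lists."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


def augment(signal, n_copies=3, noise_scale=0.1, seed=0):
    """Create synthetic variants of a series (assumed scheme, not the paper's).

    The approximation coefficients (coarse trend) are kept unchanged; only the
    detail coefficients (fine fluctuations) are perturbed with Gaussian noise.
    """
    rng = random.Random(seed)
    approx, detail = haar_dwt(signal)
    # Scale the noise relative to the RMS magnitude of the detail coefficients.
    spread = math.sqrt(sum(d * d for d in detail) / len(detail))
    copies = []
    for _ in range(n_copies):
        noisy = [d + rng.gauss(0.0, noise_scale * spread) for d in detail]
        copies.append(haar_idwt(approx, noisy))
    return copies


# Made-up traffic samples, for illustration only.
traffic = [120.0, 132.0, 128.0, 150.0, 170.0, 165.0, 140.0, 125.0]
augmented = augment(traffic)
```

Because only the detail coefficients change, each synthetic series keeps the coarse trend of the original: the sum of every adjacent pair of samples is preserved. A production pipeline would more likely use a wavelet library and a multi-level decomposition; the hand-written Haar transform is used here only to keep the sketch free of dependencies.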

A Survey of Traffic Flow Prediction Methods Based on Long Short-Term Memory Networks

Transfer Learning Based Efficient Traffic Prediction with Limited Training Data

Short-term Traffic Prediction with Deep Neural Networks and Adaptive Transfer Learning

Network Traffic Prediction Based on LSTM and Transfer Learning

Network-scale traffic prediction via knowledge transfer and regional MFD analysis

ConvLSTMTransNet: A Hybrid Deep Learning Approach for Internet Traffic Telemetry

Deep Sequence Modeling for Anomalous ISP Traffic Prediction

LNTP: an End-to-End Online Prediction Model for Network Traffic

A Transfer Learning–Based LSTM for Traffic Flow Prediction with Missing Data

A Hybrid Prediction Method for Realistic Network Traffic With Temporal Convolutional Network and LSTM

Multi-Step Internet Traffic Forecasting Models with Variable Forecast Horizons for Proactive Network Management

Dynamic Learning Framework for Smooth-Aided Machine-Learning-Based Backbone Traffic Forecasts

Petite term traffic flow prediction using deep learning for augmented flow of vehicles

An Empirical Study on Internet Traffic Prediction Using Statistical Rolling Model

Traffic Prediction with Transfer Learning: A Mutual Information-based Approach

Wavelet-Based Hybrid Machine Learning Model for Out-of-distribution Internet Traffic Prediction

Realtime mobile bandwidth prediction using LSTM neural network and Bayesian fusion

Stacked LSTM for Short-Term Traffic Flow Prediction using Multivariate Time Series Dataset

LSTTN: A Long-Short Term Transformer-based Spatio-temporal Neural Network for Traffic Flow Forecasting

Towards an Ensemble Regressor Model for Anomalous ISP Traffic Prediction