Using ARIMA to Predict the Expansion of Subscriber Data Consumption

Mike Wa Nkongolo
DOI: https://doi.org/10.3390/eng4010006
2024-04-23
Abstract: This study discusses how insights retrieved from subscriber data can impact decision-making in telecommunications, focusing on predictive modeling using machine learning techniques such as the ARIMA model. The study explores time series forecasting to predict subscriber usage trends, evaluating the ARIMA model's performance using various metrics. It also compares ARIMA with Convolutional Neural Network (CNN) models, highlighting ARIMA's superiority in accuracy and execution speed. The study suggests future directions for research, including exploring additional forecasting models and considering other factors affecting subscriber data usage.
Machine Learning
What problem does this paper attempt to address?
This paper examines how ARIMA (Autoregressive Integrated Moving Average) can be used to predict the growth of subscriber data consumption at telecommunication companies. Using 730 data points from Insights Data Storage, the researchers fitted an ARIMA model to forecast the data usage trend and compared it with Convolutional Neural Networks (CNNs). The central question is which model, ARIMA or CNN, predicts user data usage more effectively, and the research objective is to evaluate the two models on accuracy and computational speed.

The ARIMA model yielded a statistically significant p-value (0.007), supporting the prediction of data growth, and forecast a maximum growth of up to 14 Gbps. ARIMA also ran 43 times faster than the CNN, making it more efficient for handling subscriptions with a large amount of historical time series data.

The paper highlights the importance of seasonality in time series analysis for predictive performance. Seasonal analysis can also help remove features and seasonality from the original dataset, yielding standardized data. ARIMA was chosen as the time series forecasting model for predicting user data usage and for analyzing its seasonality, trend, and cycle. The paper also discusses the advantages of ARIMA in handling seasonal data and compares it with other models such as LSTM and CNN.

The stated contribution is the use of the ARIMA model with an unsupervised learning strategy to predict the growth of user data, specifically by implementing ARIMA on unlabeled features. By predicting throughput and maximum usage growth, the ARIMA model can also be used to study seasonality. Future research directions include further improving the prediction models to enhance decision efficiency and prediction accuracy.
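To make the forecasting workflow described above more concrete, the following is a minimal sketch of an ARIMA(1,1,0) model written in plain Python: the series is differenced once (the "I" step), an AR(1) term is fitted to the differences by least squares, and the forecast is integrated back to the original scale before being scored with RMSE on a holdout window. Everything here is an assumption for illustration only: the 730-point series is synthetic (the study's dataset is not reproduced), the order (1,1,0) is not the order reported by the paper, and the function names `difference`, `fit_ar1`, `forecast_arima110`, and `rmse` are hypothetical. It does not model seasonality.

```python
import math
import random


def difference(series):
    """First-order differencing: the 'I' (d=1) step of ARIMA."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares fit of d_t = c + phi * d_{t-1} + e_t; returns (c, phi)."""
    x, y = diffs[:-1], diffs[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # Constant differences: fall back to a pure drift model.
        return mean_y, 0.0
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima110(series, steps):
    """Forecast `steps` values ahead with an ARIMA(1,1,0) model."""
    diffs = difference(series)
    c, phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # AR(1) step on the differenced scale
        level += last_diff                # integrate back to the original scale
        forecasts.append(level)
    return forecasts


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


# Synthetic stand-in for 730 daily usage readings (assumed values, not real data).
random.seed(0)
usage, level = [], 5.0
for _ in range(730):
    level += 0.01 + random.gauss(0, 0.05)
    usage.append(level)

train, holdout = usage[:700], usage[700:]
predictions = forecast_arima110(train, steps=len(holdout))
error = rmse(holdout, predictions)
print(f"30-step holdout RMSE: {error:.3f}")
```

In practice a maintained library implementation would normally be preferred over a hand-rolled fit, since it also handles moving-average terms, seasonal orders, and significance tests such as the p-value reported in the study.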