A comparison of forecasting models for the resource usage of MapReduce applications

Yang Yuan Li, Tien Van Do, Hai T. Nguyen
DOI: https://doi.org/10.1016/j.neucom.2020.07.059
IF: 6
2020-12-01
Neurocomputing
Abstract: In this paper, we construct forecasting models (multivariate long short-term memory recurrent neural networks and multiple linear regression) for the resource usage prediction of four MapReduce applications and applications executed within the Spark framework. We evaluate the impact of sample size on prediction accuracy. We also propose a phase modelling approach for read/write-intensive applications. Our results show that models based on long short-term memory recurrent neural networks exhibit higher accuracy than multiple linear regression models, and that the intensive characteristics of a resource are closely related to the prediction accuracy of forecasting models. We investigated the hyperparameter tuning of such models and showed that a randomly initialised, shallow, well-tuned network may outperform deeper models that use stacked autoencoder initialisation. Furthermore, multivariate long short-term memory recurrent neural network models are more sensitive to sample size than multiple linear regression models. We show that an LSTM model trained on one machine may be used to predict the resource usage on another machine.
computer science, artificial intelligence