Performance prediction of parallel computing models to analyze cloud-based big data applications

Chao Shen, Weiqin Tong, Kim-Kwang Raymond Choo, Samina Kausar
DOI: https://doi.org/10.1007/s10586-017-1385-3
2017-11-23
Cluster Computing
Abstract: Performance evaluation of a cloud center is a necessary prerequisite to fulfilling contractual quality of service, particularly for big data applications. However, evaluating the performance of cloud services effectively is challenging because of the complexity of those services and the diversity of big data applications. In this paper, we propose a performance evaluation model for parallel computing models deployed in cloud centers to support big data applications. In this model, a big data application is divided into many parallel tasks, and task arrivals follow a general distribution. Our approach also accounts for resource heterogeneity, resource contention among cloud nodes, and the data storage strategy, all of which affect the performance of parallel computing models. The model lets us calculate key performance indicators of a cloud center, such as the mean number of tasks in the system, the probability that a task obtains immediate service, and the task waiting time. It can also be used to predict application execution time. We then demonstrate the utility of the model through simulations and benchmarking with the WordCount and TeraSort applications.
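To make the three performance indicators named in the abstract concrete, the sketch below computes them for a textbook M/M/c queue using the Erlang C formula. This is not the paper's model: it assumes Poisson task arrivals, exponential service times, and identical servers, whereas the abstract describes general arrivals and heterogeneous, contended resources. The function name `mmc_metrics` and its parameters are hypothetical and chosen only for illustration.

```python
from math import factorial


def mmc_metrics(arrival_rate, service_rate, servers):
    """Steady-state indicators for an M/M/c queue (Erlang C).

    Simplifying assumptions (NOT the paper's model): Poisson arrivals,
    exponential service times, `servers` identical nodes, unlimited queue.
    """
    offered_load = arrival_rate / service_rate   # a = lambda / mu
    utilization = offered_load / servers         # rho = a / c
    if utilization >= 1:
        raise ValueError("unstable queue: utilization must be below 1")

    # Normalizing constant: probability that the system is empty.
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    tail = offered_load ** servers / (factorial(servers) * (1 - utilization))
    p_empty = 1.0 / (head + tail)

    # Erlang C: probability that an arriving task has to wait.
    p_wait = tail * p_empty

    mean_queue_length = p_wait * utilization / (1 - utilization)
    mean_waiting_time = mean_queue_length / arrival_rate  # Little's law
    mean_tasks_in_system = mean_queue_length + offered_load

    return {
        "mean_tasks_in_system": mean_tasks_in_system,
        "p_immediate_service": 1.0 - p_wait,
        "mean_waiting_time": mean_waiting_time,
    }


if __name__ == "__main__":
    # Hypothetical workload: 8 tasks/s arriving, 1 task/s per node, 10 nodes.
    for name, value in mmc_metrics(8.0, 1.0, 10).items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions the indicators have closed forms; with general arrival distributions and heterogeneous nodes, as in the abstract, they typically require a more elaborate analytical model or simulation.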