Performance-sensitive Components Exploration in Spark Streaming

Ying Hou, Yi Liang, Chao Su
DOI: https://doi.org/10.2991/ammee-17.2017.74
2017-01-01
Abstract: Streaming data processing has become a hot topic in big data research. To ensure the timeliness of data processing, it is important to identify the performance-sensitive components of a streaming data processing platform, since doing so enables more efficient performance optimization. In this paper, we describe the data processing model of Spark Streaming, in which processing can be divided into multiple phases. We propose a simple yet useful method to explore the performance-sensitive components among these phases. Experimental results show that the proposed method is suitable for a wide range of workloads. Finally, we present a detailed example of applying this method to the typical Spark Streaming workload Word Count and demonstrate its practicability.