Optimizing Pipelined Execution for Distributed In-Memory OLAP System

Li Wang, Lei Zhang, Chengcheng Yu, Aoying Zhou
DOI: https://doi.org/10.1007/978-3-662-43984-5_15
2014-01-01
Abstract: In the coming big data era, the demand for data analysis capability in real applications is growing at an amazing pace. The increasing capacity and decreasing price of memory make it possible and attractive for a distributed OLAP system to load all of its data into memory and thus significantly improve data processing performance. In this paper, we model the performance of pipelined execution in a distributed in-memory OLAP system and find that data communication among the computation nodes, which is carried out by the data exchange operator, is the performance bottleneck. Consequently, we explore pipelined data exchange in depth and give a novel solution that is efficient, scalable, and skew-resilient. Experimental results show the effectiveness of our proposals in comparison with state-of-the-art techniques.