IDCOS: Optimization Strategy for Parallel Complex Expression Computation on Big Data

Song Yang, Jin Helin, Wang Hongzhi, Liu You
DOI: https://doi.org/10.1007/s11227-021-03674-y
2021-01-01
Abstract: Complex expressions are the basis of data analytics. To process complex expressions on big data efficiently, we developed a novel optimization strategy for parallel computation platforms such as Hadoop and Spark. To achieve high performance, the strategy aims to minimize the number of data repartitioning rounds. To this end, we modeled the expression as a graph and developed a simplification algorithm for this graph. Based on the graph, we converted the round minimization problem into a graph decomposition problem and developed a linear algorithm for it. We also designed an appropriate implementation of the optimization strategy. Extensive experimental results demonstrate that the proposed approach optimizes the computation of complex expressions effectively at small cost.