Calculational Parallel Programming (Parallel Programming With Homomorphism And Mapreduce)
Zhenjiang Hu
DOI: https://doi.org/10.1145/1863482.1863484
2010-01-01
Abstract: Parallel skeletons are designed to encourage programmers to build parallel programs from ready-made components for which efficient implementations are known to exist, making both parallel programming and the parallelization process simpler. Homomorphism and mapReduce are two well-known parallel skeletons. Homomorphism, widely studied in the program calculation community for more than twenty years, is ideally suited to the divide-and-conquer parallel computation paradigm over lists, trees, and other general algebraic data types. It also comes equipped with a set of useful theorems for manipulating homomorphisms. MapReduce, on the other hand, is a relatively new skeleton but has emerged as one of the most widely used parallel programming platforms for processing data on terabyte and petabyte scales. It allows for easy parallelization of data-intensive computations over many machines, and is used daily at companies such as Yahoo!, Google, Amazon, and Facebook. Despite the simplicity of these two skeletons, it remains a challenge for programmers to solve nontrivial problems with them. Consider, as an example, the well-known maximum segment sum problem, whose task is to compute the largest possible sum of a consecutive sublist of a given list. It is far from obvious how this problem can be solved efficiently with mapReduce. In this talk, I would like to show a calculational framework that supports the systematic development of efficient parallel programs using homomorphism and mapReduce. Being more constructive, this calculational framework for parallel programming is not only helpful in the design of efficient parallel programs, but also promising for the construction of parallelizing compilers.
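To make the maximum segment sum example concrete, the following is a minimal sketch of one common way to express it as a list homomorphism. It is not taken from the talk itself. It assumes the standard textbook tupling, where each sublist is summarized by four values (best segment, best prefix, best suffix, total), and it assumes the empty segment is allowed, so the result is never negative. The names `unit`, `combine`, `IDENTITY`, `mss`, and `mss_chunked` are hypothetical, and `mss_chunked` only simulates the map and reduce phases sequentially on one machine.

```python
from functools import reduce

# Summary of a list: (best segment sum, best prefix sum, best suffix sum, total sum).
# Assumption: the empty segment (sum 0) is allowed.
IDENTITY = (0, 0, 0, 0)  # summary of the empty list


def unit(x):
    """Summary of the singleton list [x]."""
    m = max(x, 0)
    return (m, m, m, x)


def combine(a, b):
    """Associative operator merging the summaries of two adjacent sublists."""
    m1, p1, s1, t1 = a
    m2, p2, s2, t2 = b
    return (
        max(m1, m2, s1 + p2),  # best segment: left, right, or spanning the cut
        max(p1, t1 + p2),      # best prefix may extend into the right part
        max(s2, s1 + t2),      # best suffix may extend into the left part
        t1 + t2,               # total sum
    )


def mss(xs):
    """Maximum segment sum as a homomorphism: reduce(combine) after map(unit)."""
    return reduce(combine, map(unit, xs), IDENTITY)[0]


def mss_chunked(xs, chunks):
    """Sequential simulation of a parallel run: summarize each chunk
    independently, then combine the partial summaries in order."""
    size = max(1, -(-len(xs) // chunks))  # ceiling division
    partials = [
        reduce(combine, map(unit, xs[i:i + size]), IDENTITY)
        for i in range(0, len(xs), size)
    ]
    return reduce(combine, partials, IDENTITY)[0]
```

Because `combine` is associative with `IDENTITY` as its unit, the chunks can be summarized in any grouping, which is what lets the divide-and-conquer (or map-then-reduce) evaluation agree with the sequential one. For instance, `mss([3, -4, 2, -1, 6, -3])` and `mss_chunked([3, -4, 2, -1, 6, -3], 3)` both yield 7, the sum of the segment `[2, -1, 6]`.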