Efficient Loop Partitioning for Parallel Codes of Irregular Scientific Computations

MY Guo
DOI: https://doi.org/10.1109/icapp.2002.1173553
2003-01-01
IEICE Transactions on Information and Systems
Abstract: In most distributed-memory computations, node programs are executed on processors according to the owner-computes rule. However, the owner-computes rule is not well suited to irregular application codes. In such codes, indirection in accessing the left-hand-side array makes it difficult to partition loop iterations, while indirection in accessing right-hand-side elements means that heuristics other than the owner-computes rule may reduce total communication. In this paper we propose a computes rule for irregular loop partitioning that reduces communication cost, called the least communication computes rule. Each loop iteration is assigned to a processor on which executing that iteration incurs minimal communication cost. After all iterations have been partitioned among the processors, we give a global vs. local data transformation rule, indirection array remapping, and communication optimization methods. The experimental results show that, in most cases, our approach achieves better performance than other loop partitioning rules.
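The contrast between the two rules can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not state the cost model, so the sketch assumes a block distribution shared by all arrays, a cost of one unit per referenced element that is not local to the executing processor, and ties broken toward the lowest processor id. All function names (`block_owner`, `least_communication_partition`, and so on) are hypothetical.

```python
from collections import Counter

def block_owner(index, n_elements, n_procs):
    """Owner of a global index under a block distribution (assumed layout)."""
    block = -(-n_elements // n_procs)  # ceiling division
    return index // block

def owner_computes_partition(lhs_indices, n_elements, n_procs):
    """Owner-computes rule: iteration i runs where its left-hand side lives."""
    return [block_owner(i, n_elements, n_procs) for i in lhs_indices]

def least_communication_partition(refs_per_iter, n_elements, n_procs):
    """Assign each iteration to the processor with the fewest remote references.

    refs_per_iter[i] lists every global index touched by iteration i,
    left- and right-hand side alike. The cost model is an assumption:
    one unit per referenced element not owned by the executing processor.
    """
    assignment = []
    for refs in refs_per_iter:
        local = Counter(block_owner(r, n_elements, n_procs) for r in refs)
        # Remote cost on processor p is len(refs) - local[p];
        # ties go to the lowest processor id (assumed tie-break).
        best = min(range(n_procs), key=lambda p: (len(refs) - local[p], p))
        assignment.append(best)
    return assignment

def remote_references(refs_per_iter, assignment, n_elements, n_procs):
    """Total number of non-local element references under an assignment."""
    return sum(
        1
        for refs, p in zip(refs_per_iter, assignment)
        for r in refs
        if block_owner(r, n_elements, n_procs) != p
    )

# Irregular loop: x[ia[i]] = y[ib[i]] + y[ic[i]], 8 elements on 2 processors.
ia, ib, ic = [0, 5, 2, 7], [4, 1, 6, 3], [5, 2, 7, 6]
refs = [list(t) for t in zip(ia, ib, ic)]

oc = owner_computes_partition(ia, 8, 2)         # [0, 1, 0, 1]
lc = least_communication_partition(refs, 8, 2)  # [1, 0, 1, 1]
print(remote_references(refs, oc, 8, 2))        # 7
print(remote_references(refs, lc, 8, 2))        # 4
```

In this toy loop the owner-computes assignment incurs 7 remote references, while the least-communication assignment incurs 4, because three of the four iterations read two right-hand-side elements that live on the processor that does not own the left-hand side. A realistic cost model would also weigh load balance and message aggregation, which this sketch ignores.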