Idle Period Propagation in Message-Passing Applications

Ivy Bo Peng, Stefano Markidis, Erwin Laure, Gokcen Kestor, Roberto Gioiosa
DOI: https://doi.org/10.1109/HPCC-SmartCity-DSS.2016.0134
2017-04-27
Abstract: Idle periods on different processes of message-passing applications are unavoidable. While the origin of idle periods on a single process is well understood as the effect of random system and architectural delays, it remains unclear how these idle periods propagate from one process to another. Understanding idle period propagation in message-passing applications is important because it allows application developers to design communication patterns that avoid idle period propagation and the consequent performance degradation. To understand idle period propagation, we introduce a methodology to trace idle periods that occur when a process in an MPI application is waiting for data from a delayed remote process. We apply this technique to an MPI application that solves the heat equation and study idle period propagation on three different systems. We confirm that idle periods move between processes in the form of waves and that idle period propagation proceeds in distinct stages. Our methodology also enables us to identify a self-synchronization phenomenon that occurs on two of the systems, where some processes run slower than the others.
Distributed, Parallel, and Cluster Computing
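
The wave-like propagation described in the abstract can be illustrated with a toy model. The sketch below is not the paper's tracing methodology. It assumes a one-dimensional chain of processes with non-periodic boundaries, a fixed compute time per step, and a blocking nearest-neighbor exchange, so that a process can begin a step only after it and its direct neighbors have finished the previous one. The function name `simulate_idle_wave` and all of its parameters are hypothetical and chosen for illustration only.

```python
def simulate_idle_wave(num_procs, num_steps, compute=1.0,
                       delay_rank=0, delay_step=0, delay=5.0):
    """Toy model of idle period propagation (illustrative assumption only).

    Returns idle[step][rank]: the time each rank spends waiting for its
    neighbours before it can start the given step.
    """
    finish = [0.0] * num_procs  # time at which each rank finished its last step
    idle = []
    for step in range(num_steps):
        # A rank starts once it and its direct neighbours have finished
        # the previous step (blocking halo exchange, non-periodic chain).
        start = []
        for rank in range(num_procs):
            deps = [finish[n] for n in (rank - 1, rank, rank + 1)
                    if 0 <= n < num_procs]
            start.append(max(deps))
        # Idle time is the gap between a rank's own finish and its next start.
        idle.append([start[r] - finish[r] for r in range(num_procs)])
        # Every rank computes for `compute`; one rank is delayed once.
        finish = [
            start[r] + compute
            + (delay if (r == delay_rank and step == delay_step) else 0.0)
            for r in range(num_procs)
        ]
    return idle


if __name__ == "__main__":
    # Inject a single delay on the middle rank and print the idle times per step.
    for step, row in enumerate(simulate_idle_wave(5, 4, delay_rank=2)):
        print(step, row)
```

In this model, a single delay injected on the middle rank makes its two neighbors idle in the next step, and the idle period then moves outward by one rank per step until it reaches the ends of the chain. Real systems add noise, communication latency, and non-blocking communication, none of which this sketch models.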