Optimal collective communication algorithms in grid computing system

Yanhui Wu, Xinda Lu
2005-01-01
Abstract: Computational grids make it feasible to run parallel programs on large-scale, geographically distributed computer systems. Writing parallel applications for such systems is difficult, however, because it may require changing the communication structure of the application. MPI's collective operations allow some of these changes to be hidden from the application programmer. We have developed optimal collective communication algorithms for wide-area systems that take the hierarchical network structure into account. The bandwidth and the latency of LAN and WAN links each differ by almost two orders of magnitude. Our algorithms are designed to send the minimal amount of data over the slow wide-area links and to incur only a single wide-area latency. Compared with MPICH, which does not consider the topology, large performance improvements are possible.
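The design goal stated in the abstract can be illustrated with a small message-counting sketch. This is not the authors' implementation: it only contrasts a topology-unaware binomial-tree broadcast with a generic two-level broadcast in which the root sends one message to a coordinator in each remote cluster and all further forwarding stays on the LAN. The function names, the choice of the first rank in each cluster as coordinator, and the two-cluster layout are assumptions made for illustration.

```python
from typing import List, Tuple

Message = Tuple[int, int]  # (sender rank, receiver rank)


def binomial_bcast(ranks: List[int]) -> List[Message]:
    """Binomial-tree broadcast rooted at ranks[0], ignoring topology."""
    msgs: List[Message] = []
    n = len(ranks)
    step = 1
    while step < n:
        # Every position that already holds the data forwards it `step` places on.
        for i in range(step):
            if i + step < n:
                msgs.append((ranks[i], ranks[i + step]))
        step *= 2
    return msgs


def two_level_bcast(clusters: List[List[int]]) -> List[Message]:
    """Two-level broadcast rooted at clusters[0][0].

    Assumption: the first rank of each cluster acts as its coordinator.
    """
    root = clusters[0][0]
    # Phase 1: one WAN message per remote cluster, all sent by the root.
    msgs: List[Message] = [(root, c[0]) for c in clusters[1:]]
    # Phase 2: each coordinator broadcasts inside its own cluster (LAN only).
    for c in clusters:
        msgs.extend(binomial_bcast(c))
    return msgs


def wan_messages(msgs: List[Message], clusters: List[List[int]]) -> int:
    """Count messages whose sender and receiver sit in different clusters."""
    site = {rank: i for i, c in enumerate(clusters) for rank in c}
    return sum(1 for src, dst in msgs if site[src] != site[dst])


# Hypothetical layout: two clusters of four processes each.
clusters = [[0, 1, 2, 3], [4, 5, 6, 7]]
all_ranks = [r for c in clusters for r in c]

flat = binomial_bcast(all_ranks)
hier = two_level_bcast(clusters)

print("flat WAN messages:", wan_messages(flat, clusters))
print("two-level WAN messages:", wan_messages(hier, clusters))
```

Both schedules deliver the data with the same total number of messages; the difference in this sketch is how many of them cross the slow link. Every rank in a remote cluster is reached through exactly one WAN hop in the two-level schedule.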