Asynchronous MPI for the Masses

Markus Wittmann, Georg Hager, Thomas Zeiser, Gerhard Wellein
DOI: https://doi.org/10.48550/arXiv.1302.4280
2013-02-18
Abstract: We present a simple library that equips MPI implementations with truly asynchronous non-blocking point-to-point operations and is independent of the underlying communication infrastructure. It utilizes the MPI profiling interface (PMPI) and the MPI_THREAD_MULTIPLE thread compatibility level, and works with current versions of Intel MPI, Open MPI, MPICH2, MVAPICH2, Cray MPI, and IBM MPI. We show performance comparisons on a commodity InfiniBand cluster and two tier-1 systems in Germany, using low-level and application benchmarks. Issues of thread/process placement and the peculiarities of different MPI implementations are discussed in detail. We also identify the MPI libraries that already support asynchronous operations. Finally, we show how our ideas can be extended to MPI-IO.
Distributed, Parallel, and Cluster Computing; Performance