Optimized MPI Collective Algorithms for Dragonfly Topology
Guangnan Feng, Dezun Dong, Yutong Lu
DOI: https://doi.org/10.1145/3524059.3532380
2022-01-01
Abstract: The Message Passing Interface (MPI) is the dominant programming model for scientific computing on today's supercomputing systems. Although many general and efficient algorithms have been proposed for MPI collective operations, there is still room for topology-aware optimization. Dragonfly is a highly scalable, low-diameter, and cost-efficient network topology adopted in a growing number of supercomputing networks. However, the Dragonfly topology limits the performance of some MPI collective operations. In this paper, our analysis shows that the bottlenecks of collective algorithms on the Dragonfly topology are intra-job interference, inter-job interference, and topology mismatch. We propose five optimizations, namely Pseudo-random Pairwise, Tree-based Shuffle, Reversed Recursive Doubling, Reordered Bruck, and Matched Rabenseifner, for the MPI collective operations All-Gather, All-to-All, All-Reduce, and Reduce-Scatter. We evaluate each optimization using the CODES network simulation framework with minimal, non-minimal, and adaptive routing. The simulation results demonstrate that the performance of All-to-All, All-Gather, All-Reduce, and Reduce-Scatter can be improved by 4.7x, 3.4x, 12.7%, and 4.1x, respectively, for 32768-node jobs with adaptive routing.
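For context, the communication schedules of two textbook baselines that the proposed optimizations build on can be sketched as follows. This is a minimal illustration of the standard XOR-based pairwise exchange and recursive doubling patterns, assuming a power-of-two process count; it is not the paper's Pseudo-random Pairwise or Reversed Recursive Doubling variants, and the function names are hypothetical.

```python
def pairwise_partners(rank, nprocs):
    """Partner of `rank` at each step of the standard XOR pairwise exchange.

    Assumes nprocs is a power of two. At step s (1 <= s < nprocs),
    every rank exchanges data with rank XOR s.
    """
    if nprocs < 1 or nprocs & (nprocs - 1):
        raise ValueError("nprocs must be a power of two")
    return [rank ^ step for step in range(1, nprocs)]


def recursive_doubling_partners(rank, nprocs):
    """Partner of `rank` at each step of standard recursive doubling.

    Assumes nprocs is a power of two. The partner distance doubles
    every step (1, 2, 4, ...), giving log2(nprocs) steps in total.
    """
    if nprocs < 1 or nprocs & (nprocs - 1):
        raise ValueError("nprocs must be a power of two")
    partners = []
    distance = 1
    while distance < nprocs:
        partners.append(rank ^ distance)
        distance *= 2
    return partners


if __name__ == "__main__":
    # Schedules for rank 5 in an 8-process job.
    print(pairwise_partners(5, 8))
    print(recursive_doubling_partners(5, 8))
```

In both schedules the partner at each step is fixed by rank arithmetic alone, regardless of where ranks sit in the network. This independence from placement is what leaves room for the interference and topology mismatch that the abstract identifies on Dragonfly.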