TAPS: Topology-Aware Intra-Operator Parallelism Strategy Searching Algorithm for Deep Neural Networks

Peng Liang, Hao Zheng, Teng Su, Linbo Qiao, Dongsheng Li
DOI: https://doi.org/10.48550/arXiv.2301.04285
2023-01-11
Distributed, Parallel, and Cluster Computing
Abstract: TAPS is a Topology-Aware intra-operator Parallelism strategy Searching algorithm that generates intra-operator parallelism strategies by considering both intra-node and inter-node bandwidth. Most existing auto-parallelism work uses the communication volume directly as the communication cost when generating strategies, which we prove to be sub-optimal in multi-node cases. We design a topology-aware cost model for multi-node intra-operator parallelism strategy searching. Numerical experiments demonstrate that TAPS can generate strategies with up to 85% lower communication cost, outperforming the latest baselines.
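To illustrate why raw communication volume can mislead a strategy search across multiple nodes, the following is a minimal sketch and not the paper's actual cost model. It assumes a simple additive model in which each strategy's traffic is split into an intra-node and an inter-node part, each divided by its own bandwidth. The `Strategy` class, the two cost functions, the candidate strategies, and the bandwidth figures are all hypothetical, chosen only to make the effect visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    """Hypothetical parallelism strategy described only by its traffic."""
    name: str
    intra_node_bytes: float  # bytes exchanged between devices in one node
    inter_node_bytes: float  # bytes exchanged across nodes


def volume_cost(s: Strategy) -> float:
    """Volume-only cost: total bytes moved, ignoring where they travel."""
    return s.intra_node_bytes + s.inter_node_bytes


def topology_aware_cost(s: Strategy, intra_bw: float, inter_bw: float) -> float:
    """Assumed additive model: bytes on each link class divided by its bandwidth."""
    return s.intra_node_bytes / intra_bw + s.inter_node_bytes / inter_bw


# Hypothetical bandwidths in bytes per second (fast intra-node, slow inter-node).
INTRA_BW = 300e9
INTER_BW = 12.5e9

candidates = [
    # Less total traffic, but most of it crosses the slow inter-node link.
    Strategy("A", intra_node_bytes=1e9, inter_node_bytes=4e9),
    # More total traffic, but most of it stays on the fast intra-node link.
    Strategy("B", intra_node_bytes=7e9, inter_node_bytes=1e9),
]

best_by_volume = min(candidates, key=volume_cost)
best_by_topology = min(
    candidates, key=lambda s: topology_aware_cost(s, INTRA_BW, INTER_BW)
)

print("volume-only choice:   ", best_by_volume.name)
print("topology-aware choice:", best_by_topology.name)
```

Under these assumed numbers the volume-only metric prefers strategy A, while the bandwidth-weighted metric prefers strategy B, because B keeps most of its traffic on the faster intra-node links.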