POSTER: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training.

Jiaao He, Shengqi Chen, Jidong Zhai
DOI: https://doi.org/10.1145/3627535.3638481
2024-01-01
Abstract: Recommendation models are an important category of deep learning models, and their size is growing rapidly. They consist of a sparse part with a memory footprint of TBs and a dense part that demands PFLOPs of computing capability to train. Unfortunately, the two parts use different parallel strategies, and the high cost of the sparse communication needed to re-organize data between them impedes training scalability. Based on observations of sparse access patterns, we design a two-fold fine-grained parallel strategy to accelerate sparse communication. A performance model is built to select an optimal set of items to replicate across all GPUs, so that all-to-all communication volume is reduced while memory consumption stays acceptable. The all-to-all overhead is further reduced by parallel scheduling techniques. In our evaluation on 32 GPUs over real-world datasets, an end-to-end speedup of 2.16x to 16.8x is achieved over the baselines.
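The replication idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the paper's performance model, so the code below substitutes a simple frequency-based greedy heuristic: it replicates the most frequently accessed items up to a fixed per-GPU budget and estimates how many lookups would still need all-to-all communication. The function names, the toy access trace, the budget, and the assumption that non-replicated items are sharded uniformly across GPUs are all hypothetical and chosen only for illustration. The cost of keeping replicated items synchronized is ignored.

```python
from collections import Counter


def select_replicated(freq, budget_items):
    """Greedy stand-in for a performance model (an assumption, not the
    paper's method): replicate the most frequently accessed item ids."""
    # Sort by descending access count, breaking ties by id for determinism.
    ranked = sorted(freq, key=lambda item: (-freq[item], item))
    return set(ranked[:budget_items])


def remote_lookups(freq, replicated, num_gpus):
    """Expected number of lookups that still go through all-to-all.

    Assumes non-replicated items are sharded uniformly, so a lookup hits
    a remote GPU with probability (num_gpus - 1) / num_gpus.
    """
    sharded = sum(count for item, count in freq.items() if item not in replicated)
    return sharded * (num_gpus - 1) / num_gpus


# Hypothetical skewed access trace: a few hot items dominate the lookups.
trace = [0] * 50 + [1] * 30 + [2] * 10 + [3] * 6 + [4] * 4
freq = Counter(trace)

replicated = select_replicated(freq, budget_items=2)
baseline = remote_lookups(freq, set(), num_gpus=4)
reduced = remote_lookups(freq, replicated, num_gpus=4)

print(sorted(replicated))  # [0, 1]
print(baseline, reduced)   # 75.0 15.0
```

In this toy trace, replicating two of the five items removes 80% of the expected remote lookups, which shows why a skewed access pattern makes selective replication attractive when memory is limited.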