PipePrune: Pipeline Parallel Based on Convolutional Layer Pruning for Distributed Deep Learning.

D. Tan, W. Jiang, S. Qin, H. Jin
DOI: https://doi.org/10.1109/HPCC-DSS-SmartCity-DependSys53884.2021.00361
2021-01-01
Abstract: By combining pipelining with model parallelism and data parallelism, pipeline parallelism significantly improves the efficiency of distributed deep learning systems. However, it still falls short of ideal performance because of the bubbles and gaps caused by imbalance among pipeline stages. To further explore the potential of pipeline parallelism, we propose PipePrune, a novel approach that adds a convolutional layer pruning strategy to the pipeline to reduce these bubbles and gaps. In convolutional layers with heavy overheads, unimportant kernels are identified by their L1-norm and pruned, which makes the processing overheads of the pipeline stages more balanced. Experimental results show that, compared with state-of-the-art pipeline methods, PipePrune noticeably improves training speed (e.g., for ResNet50 on ImageNet, a speedup of about 30% is achieved with only a 1.1% loss in training accuracy).
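The L1-norm criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `l1_norm` and `prune_filters` and the `keep_ratio` parameter are hypothetical, and the abstract does not say how PipePrune chooses which layers to prune or how many kernels to remove from each. The sketch only shows the general idea of ranking a layer's filters by the sum of their absolute weights and dropping the smallest ones.

```python
from typing import List, Sequence, Tuple


def l1_norm(filt) -> float:
    """Sum of absolute values over an arbitrarily nested list of weights."""
    if isinstance(filt, (int, float)):
        return abs(float(filt))
    return sum(l1_norm(x) for x in filt)


def prune_filters(filters: Sequence, keep_ratio: float) -> Tuple[List, List[int]]:
    """Keep the filters with the largest L1-norms.

    `keep_ratio` is an assumed knob for this sketch, not a parameter
    taken from the paper. Returns the kept filters and their original
    indices, in their original order.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    # Rank filter indices from largest to smallest L1-norm.
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    kept = sorted(ranked[:n_keep])
    return [filters[i] for i in kept], kept


# Four toy filters of shape 1x2x2 filled with constant values,
# so their L1-norms are 12, 4, 16 and 2.
toy_layer = [[[[v, v], [v, v]]] for v in (3.0, 1.0, 4.0, 0.5)]
pruned, kept = prune_filters(toy_layer, keep_ratio=0.5)
print(kept)  # [0, 2]
```

In a pipeline setting, a step like this would presumably be applied only to the convolutional layers in the heaviest stages, so that their compute cost moves closer to that of the other stages.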