A Swap Dominated Tensor Re-Generation Strategy for Training Deep Learning Models

Lijie Wen,Zan Zong,Li Lin,Leilei Lin
DOI: https://doi.org/10.1109/ipdps53621.2022.00101
2022-01-01
Abstract:With the growing of the depth of neural networks and the scale of data, the difficulty of network training also increases. When the GPU memory is insufficient, it is challenging to train deeper models. Recent research uses tensor swapping and recomputation techniques in a combined manner to optimize the memory usage. However, complex dependencies of the DNN graph limit the improvement of the single GPU memory optimization. Improper swap decisions even brings negative effects because the source of the recomputation may have been swapped out. In this paper, we propose a novel swap dominated tensor re-generation strategy, called STR, which combines swap and recomputation techniques to find the optimal execution plan for the DNN training when the memory is limited. We formalize our memory optimization problem with constraints which describe the dependency of the operator calculation and the bandwidth usage of swap. A host checkpoint mechanism is designed to make full use of the swapped tensors, which reduces the cost of the recomputation. We also present an approximation method based on a recursive source tracing procedure to improve the optimization efficiency. We implement a prototype of STR as a plugin on TensorFlow. The experimental result shows that STR improves up to 21.3% throughput compared with the state-of-the-art hybrid optimization strategy.
What problem does this paper attempt to address?