An effective method for operations placement in Tensor Flow

Junnan Liu, Chengfan Jia, Junshi Chen, Han Lin, Xu Jin, Hong An
DOI: https://doi.org/10.1145/3318265.3318270
2019-01-01
Abstract: Recent work in deep learning has shown that large neural networks can dramatically improve performance, but this comes with growing computational demands on hardware. To meet those demands, a common approach is to train such models on heterogeneous systems with a mixture of hardware devices such as CPUs and GPUs. Normally, the decision of which parts of a neural network to place on which devices is made by researchers using heuristic algorithms. In this paper, we introduce an effective method to optimize operation placement for TensorFlow computational graphs on heterogeneous systems, using deep neural networks to predict a device for each operation in a target computational graph. Based on reinforcement learning, our method learns to group operations and assign each group to a corresponding device. To take advantage of the information carried by each operation, we use a fully-connected network to group operations. In addition, we use the actual running time of the predicted placement as the reward to train the predictive network with policy gradients. On widely used models in computer vision and machine translation, our method finds optimized placements that outperform those of human experts. When applied to the Neural Machine Translation model on the WMT14 German-English dataset, our method reduces the execution time of a single training step by up to 28.41%.
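The reward-driven training loop described in the abstract can be illustrated with a minimal policy-gradient (REINFORCE) sketch. This is not the authors' implementation: it assumes the operations have already been grouped, replaces the neural networks with a plain table of per-group logits, and substitutes a hypothetical `simulated_step_time` function for the measured running time of a real TensorFlow step. All names and parameters below are illustrative assumptions.

```python
import numpy as np


def simulated_step_time(placement, group_cost, device_speed):
    """Hypothetical stand-in for a measured step time (an assumption):
    each device runs its groups serially, and the step takes as long
    as the most heavily loaded device."""
    load = np.zeros(len(device_speed))
    for group, device in enumerate(placement):
        load[device] += group_cost[group] / device_speed[device]
    return load.max()


def softmax(x):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def train_placement(group_cost, device_speed, steps=500, lr=0.5, seed=0):
    """REINFORCE over a tabular softmax policy: one row of logits per
    operation group, one column per device."""
    rng = np.random.default_rng(seed)
    n_groups, n_devices = len(group_cost), len(device_speed)
    logits = np.zeros((n_groups, n_devices))
    baseline = None
    best_placement, best_time = None, np.inf

    for _ in range(steps):
        probs = softmax(logits)
        # Sample one device per group from the current policy.
        placement = np.array(
            [rng.choice(n_devices, p=probs[g]) for g in range(n_groups)]
        )
        step_time = simulated_step_time(placement, group_cost, device_speed)
        if step_time < best_time:
            best_time, best_placement = step_time, placement.copy()

        # Shorter step time means higher reward; a moving-average
        # baseline reduces the variance of the gradient estimate.
        reward = -step_time
        baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
        advantage = reward - baseline

        # Gradient of log pi(placement) w.r.t. logits: one_hot - probs.
        grad = -probs
        grad[np.arange(n_groups), placement] += 1.0
        logits += lr * advantage * grad

    return best_placement, best_time
```

Calling `train_placement(np.array([4.0, 2.0, 1.0, 3.0]), np.array([1.0, 2.0]))` returns the fastest placement sampled during training together with its simulated step time. In a real setting the reward would come from timing an actual training step under the sampled placement, which is far more expensive per sample than this toy cost model.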