Graph Attention Neural Network Distributed Model Training

Maryam Heidari, Armin Esmaeilzadeh, Mina Esmail Zadeh Nojoo Kambar
DOI: https://doi.org/10.1109/aiiot54504.2022.9817156
2022-06-06
Abstract: The scale of neural language models has increased significantly in recent years. As a result, the time complexity of training larger language models and their resource utilization have increased at an even higher rate. In this research, we propose a distributed implementation of a Graph Attention Neural Network model with 120 million parameters and train it on a cluster of eight GPUs. We demonstrate a threefold speedup in model training compared with single-GPU training, while accuracy and loss remain stable during both training and testing.
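To put the reported speedup in context, the short sketch below computes parallel efficiency, defined here as speedup divided by the number of devices. The function name `parallel_efficiency` is hypothetical, and the inputs are only the two figures stated in the abstract (a threefold speedup on eight GPUs). It is an illustration of the arithmetic, not part of the authors' method.

```python
def parallel_efficiency(speedup: float, num_devices: int) -> float:
    """Return speedup divided by device count (1.0 means perfect linear scaling)."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    return speedup / num_devices


# Figures taken from the abstract: about 3x speedup on a cluster of 8 GPUs.
efficiency = parallel_efficiency(3.0, 8)
print(f"{efficiency:.3f}")  # prints 0.375
```

Under this definition, the reported result corresponds to an efficiency of 0.375, meaning each GPU contributes a little over a third of what ideal linear scaling would provide. The abstract does not say where the remaining time goes.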
Computer Science