Analysis of Performance and Optimization in MindSpore on Ascend NPUs

Bangchuan Wang, Chuying Yang, Rui Zhu, Xiao Liu, Mingyao Zhou, Nenggan Zheng
DOI: https://doi.org/10.1109/icpads60453.2023.00237
2023-01-01
Abstract: With the rapid advancement of artificial intelligence, the complexity and depth of deep neural networks (DNNs) continue to grow, placing higher demands on computational power. To meet these demands, manufacturers have developed specialized processors for deep learning training, such as Huawei’s Ascend Neural Processing Unit (NPU). To fully leverage the capabilities of the Ascend NPU, Huawei has introduced the MindSpore deep learning framework. The computational performance of a deep learning framework is critical for developers. However, there is a lack of comprehensive research on the performance and optimization of the MindSpore framework on the Ascend NPU, which leaves few references for deep learning (DL) development with MindSpore on this hardware. To address this gap, this study examines the performance and optimization of MindSpore on Ascend NPUs through detailed experiments involving diverse workloads and multiple analysis metrics. The analysis is conducted at three levels: operations, models, and techniques, investigating operation configurations, performance bottlenecks, and technique selection within the training process of DNNs. Consequently, this study offers significant insights and guidance for DL researchers and practitioners using MindSpore on Ascend NPUs.