Multi-Parameter Performance Modeling Based on Machine Learning with Basic Block Features

Meng Hao, Weizhe Zhang, Yiming Wang, Dong Li, Wen Xia, Hao Wang, Chen Lou
DOI: https://doi.org/10.1109/ispa-bdcloud-sustaincom-socialcom48970.2019.00054
2019-01-01
Abstract: As HPC architectures and software grow in complexity and scale, performance modeling of parallel applications on large-scale HPC platforms has become increasingly important. It plays an important role in many areas, such as performance analysis, job management, and resource estimation. In this work, we propose a multi-parameter performance modeling and prediction framework called MPerfPred, which uses basic block frequencies as features and applies machine learning algorithms to automatically construct multi-parameter performance models with high generalization ability. To reduce the prediction overhead, we propose feature-filtering strategies that reduce the number of features in the training stage, and we build a serial program called the BBF collector for each target application to quickly collect feature values in the prediction stage. We demonstrate the use of MPerfPred on the TianHe-2 supercomputer with six parallel applications. Results show that MPerfPred with SVR achieves better prediction accuracy than other input-parameter-based modeling methods. The average prediction error and the average standard deviation of prediction errors of MPerfPred are 8.42% and 6.09%, respectively. In the prediction stage, the average prediction overhead of MPerfPred is less than 0.13% of the total execution time.
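The two quantities the abstract reports, a reduced feature set and a mean and standard deviation of prediction errors, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the filtering rule (drop basic blocks whose frequency never changes across training runs), the error definition (absolute relative error in percent), the function names, and all data values are assumptions made for illustration, and the SVR model itself is left out.

```python
from statistics import mean, pstdev


def filter_constant_features(rows):
    """Return indices of basic blocks whose frequency varies across runs.

    Assumed filtering rule: a block with the same count in every training
    run carries no information about the input parameters.
    """
    n_features = len(rows[0])
    return [j for j in range(n_features) if len({row[j] for row in rows}) > 1]


def relative_errors(predicted, measured):
    """Absolute relative error of each prediction, in percent (assumed metric)."""
    return [abs(p - m) / m * 100.0 for p, m in zip(predicted, measured)]


# Hypothetical basic block frequencies: one row per training run,
# one column per basic block.
bbf_rows = [
    [10, 400, 1, 25],
    [10, 800, 1, 50],
    [10, 1600, 1, 100],
]

kept = filter_constant_features(bbf_rows)
filtered = [[row[j] for j in kept] for row in bbf_rows]

# Hypothetical measured and predicted execution times, in seconds.
measured = [2.0, 4.0, 8.0]
predicted = [2.2, 3.8, 8.4]

errors = relative_errors(predicted, measured)
avg_error = mean(errors)    # average prediction error, in percent
std_error = pstdev(errors)  # population standard deviation of the errors

print(kept, filtered)
print(round(avg_error, 2), round(std_error, 2))
```

In a fuller pipeline, the rows of `filtered` would be the inputs to a regression model such as SVR, and `predicted` would come from that model rather than being written by hand.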