Constructing Skeleton for Parallel Applications with Machine Learning Methods
Zihang Zhang, Jingwei Sun, Jiepeng Zhang, Yuze Qin, Guangzhong Sun
DOI: https://doi.org/10.1145/3339186.3339197
2019-01-01
Abstract: Performance prediction has always been important in the domain of parallel computing. For programs executed on workstation clusters and supercomputing systems, precise prediction of execution time can help task scheduling and resource management. A practical and effective type of prediction method is the skeleton-based method. It extracts an executable code snippet, called a skeleton, from the traces of program executions, and uses the skeleton to replay the behaviors and predict the performance of the original program. However, traditional skeleton-based methods require fixed inputs to construct reliable skeletons, which limits their application scope. In this paper, we present a novel method to construct skeletons for parallel programs. Our method combines code instrumentation and machine learning techniques, which enables skeletons to respond dynamically to varying inputs and make corresponding performance predictions. In our evaluations on three benchmarks, MCB, LULESH and STREAM, the proposed method achieves average prediction error rates of 27%, 7% and 9%, respectively.
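The idea of an input-aware skeleton can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the learning model, so a one-variable least-squares fit stands in for the machine learning step, and the trace format, the function names (`fit_line`, `build_skeleton`, `mean_relative_error`) and the numbers are all hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for the learned model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def build_skeleton(traces):
    """Build a predictor from instrumented runs.

    traces: list of (input_size, loop_count, seconds_per_iteration),
    an assumed trace format used only for this illustration.
    """
    sizes = [t[0] for t in traces]
    a, b = fit_line(sizes, [t[1] for t in traces])
    per_iter = mean(t[2] for t in traces)

    def skeleton(input_size):
        # Predict the loop count for an unseen input, then "replay" it
        # by multiplying with the measured cost of one iteration.
        loops = max(0.0, a * input_size + b)
        return loops * per_iter

    return skeleton

def mean_relative_error(predicted, actual):
    """Average of |predicted - actual| / actual over all runs."""
    return mean(abs(p - t) / t for p, t in zip(predicted, actual))

# Hypothetical traces from three instrumented runs.
traces = [(100, 200, 0.5), (200, 400, 0.5), (400, 800, 0.5)]
skeleton = build_skeleton(traces)
predicted = skeleton(300)  # estimated execution time for an unseen input size
```

A fixed-input skeleton would hard-code the loop count observed in one trace; here the count is a function of the input, which is what lets the skeleton respond to inputs that were never traced. An error rate such as those quoted above would correspond to `mean_relative_error` computed over held-out runs.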