Communication Analysis And Performance Prediction Of Parallel Applications On Large-Scale Machines

Yan Li, Jidong Zhai, Keqin Li
DOI: https://doi.org/10.4018/978-1-5225-0287-6.ch005
2016-01-01
Abstract: With the development of high-performance computers, communication performance has become a key factor in the performance of HPC applications. Communication patterns can be obtained by analyzing communication traces. However, existing approaches to generating communication traces must execute the entire parallel application on a full-scale system, which is time-consuming and expensive. Furthermore, designers of large-scale parallel computers need to predict the performance of a parallel application at the design phase. Despite previous efforts, accurately and efficiently estimating the sequential computation time of each process for large-scale parallel applications on target machines that do not yet exist remains an open problem. In this chapter, we introduce a novel technique for fast communication trace collection for large-scale parallel applications, together with an automatic performance prediction framework that uses a trace-driven network simulator.