FAST: A Fast Stencil Autotuning Framework Based On An Optimal-Solution Space Model

Yulong Luo, Guangming Tan, Zeyao Mo, Ninghui Sun
DOI: https://doi.org/10.1145/2751205.2751214
2015-01-01
Abstract: Stencil computations comprise an important class of kernels in many scientific computing applications. As the diversity of both architectures and programming models grows, autotuning is emerging as a critical strategy for achieving portable performance across a broad range of execution contexts for stencil computations. However, costly tuning overhead is a major obstacle to its adoption. In this work, we propose FAST, a fast stencil autotuning framework based on an Optimal-Solution Space (OSS) model, to significantly improve tuning speed. It leverages a feature extractor that comprehensively characterizes a stencil computation. Using the extracted features, FAST constructs an OSS database to train an off-line model that provides on-line prediction. We evaluate FAST with five important stencil computation applications on both an Intel Xeon multicore CPU and an NVIDIA Tesla K20c GPU. Compared with state-of-the-art stencil autotuners such as Patus and SDSL, FAST improves autotuning speed by a factor of 10 to 2697 without any user annotation, while achieving comparable performance.