A Comprehensive Framework for Synthesizing Stencil Algorithms on FPGAs Using OpenCL Model

Shuo Wang, Yun Liang
DOI: https://doi.org/10.1145/3061639.3062185
2017-01-01
Abstract: Iterative stencil algorithms find applications in a wide range of domains. FPGAs have long been adopted for computation acceleration due to the advantages of their dedicated hardware design. Hence, FPGAs are a compelling alternative for executing iterative stencil algorithms. However, efficient implementation of iterative stencil algorithms on FPGAs is very challenging due to the data dependencies between iterations and between elements in the stencil algorithms, the programming hurdles of FPGAs, and the large design space. In this paper, we present a comprehensive framework that synthesizes iterative stencil algorithms on FPGAs efficiently. We leverage the OpenCL-to-FPGA toolchain to generate accelerators automatically and perform design space exploration at a high level. We propose to bridge neighboring tiles through pipes and enable data sharing among them to improve computation efficiency. We then extend the equal-tile-size design to a heterogeneous design with different tile sizes to balance the computation among tiles. We also develop analytical performance models to explore the complex design space. Experiments using a wide range of stencil applications demonstrate that, on average, our heterogeneous implementations achieve a 1.65X performance speedup while using fewer hardware resources than the state-of-the-art.
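The tiling idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' OpenCL implementation: the 1D 3-point averaging stencil, the function names, and the tile sizes are all assumptions chosen for illustration, and the "pipe" between neighboring tiles is modeled as a plain exchange of one boundary cell per iteration. The sketch only shows that tiles of unequal size, sharing boundary data with their neighbors, reproduce the untiled result.

```python
def stencil_step(cells, left, right):
    """Apply one 3-point averaging step to `cells`, given one halo value per side."""
    padded = [left] + cells + [right]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def run_reference(grid, iterations):
    """Untiled baseline; the two outermost cells are held fixed."""
    grid = list(grid)
    for _ in range(iterations):
        interior = stencil_step(grid[1:-1], grid[0], grid[-1])
        grid = [grid[0]] + interior + [grid[-1]]
    return grid


def run_tiled(grid, iterations, tile_sizes):
    """Tiled version; tile_sizes (hypothetical parameter) may be unequal."""
    interior = list(grid[1:-1])
    if sum(tile_sizes) != len(interior) or min(tile_sizes) < 1:
        raise ValueError("tile sizes must be positive and cover the interior")

    # Split the interior cells into consecutive tiles.
    tiles, start = [], 0
    for size in tile_sizes:
        tiles.append(interior[start:start + size])
        start += size

    for _ in range(iterations):
        # Stand-in for the pipes: each tile receives the edge cell of its
        # left and right neighbor (or the fixed grid boundary) before updating.
        lefts = [grid[0]] + [t[-1] for t in tiles[:-1]]
        rights = [t[0] for t in tiles[1:]] + [grid[-1]]
        tiles = [stencil_step(t, l, r) for t, l, r in zip(tiles, lefts, rights)]

    return [grid[0]] + [x for t in tiles for x in t] + [grid[-1]]


if __name__ == "__main__":
    data = [float(i * i % 7) for i in range(12)]
    print(run_reference(data, 5))
    print(run_tiled(data, 5, [2, 3, 5]))  # unequal tiles, same result
```

In this sketch the tile sizes only affect how the work is partitioned, not the values computed; how to choose them to balance computation on a real FPGA is what the paper's analytical models address and is outside the scope of this example.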