HW/SW approaches to accelerate GRAPES in an FU array

Wei Wang, Jun Yao, Youhui Zhang, Wei Xue, Yasuhiko Nakashima, Weimin Zheng
DOI: https://doi.org/10.1109/CoolChips.2013.6547920
2013-01-01
Abstract: In this research, GRAPES, a high-performance computing weather forecasting application, has been tuned for a functional unit (FU) array based architecture. Software and hardware approaches are specifically employed to increase data locality and data reuse in order to accelerate the stencil computation in GRAPES. The simulation results indicate that a per-core average IPC of 12.3 can be achieved within a 20-stage FU array processor, which is 5.8x more power-efficient than a many-core processor (MCP) built in the same process technology. This can accordingly slow the growth of communication in the cluster system by an order of magnitude, resulting in a 12x power-efficiency boost across all PEs. © 2013 IEEE.