TensorFlow as a DSL for stencil-based computation on the Cerebras Wafer Scale Engine

Nick Brown, Brandon Echols, Justs Zarins, Tobias Grosser
DOI: https://doi.org/10.48550/arXiv.2210.04795
2022-08-27
Abstract: The Cerebras Wafer Scale Engine (WSE) is an accelerator that combines hundreds of thousands of AI-cores onto a single chip. Whilst this technology has been designed for machine learning workloads, the significant amount of raw compute available means that it is also a very interesting potential target for accelerating traditional HPC codes. Many of these algorithms are stencil-based, where update operations involve contributions from neighbouring elements. In this paper we explore the suitability of the WSE for such codes from the perspective of an early adopter, compared against CPUs and GPUs. Using TensorFlow as the interface, we explore performance and demonstrate that, whilst there is still work to be done around exposing the programming interface to users, the performance of the WSE is impressive: in our experiments it outperforms four V100 GPUs by two and a half times and two Intel Xeon Platinum CPUs by around 114 times. There is therefore significant potential for this technology to play an important role in accelerating HPC codes on future exascale supercomputers.
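To make the term "stencil-based" concrete, the following is a minimal sketch of a single 5-point Jacobi sweep in plain Python. It is an illustration only and is not taken from the paper: the function name `jacobi_step`, the 4x4 grid, and the fixed-boundary treatment are all assumptions chosen for brevity, and the paper's own TensorFlow formulation may differ.

```python
def jacobi_step(grid):
    """Return a new grid after one 5-point Jacobi sweep.

    Each interior cell becomes the average of its four direct
    neighbours (up, down, left, right). Boundary cells are copied
    unchanged (an assumption made for this illustration).
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # copy so every read uses the old values
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
    return new


# Hypothetical 4x4 example: top boundary held at 1.0, everything else 0.0.
grid = [[0.0] * 4 for _ in range(4)]
grid[0] = [1.0] * 4
result = jacobi_step(grid)
```

Because each cell reads only its immediate neighbours, every interior update in a sweep is independent of the others, which is the property that makes this class of algorithm a natural candidate for highly parallel hardware.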
Distributed, Parallel, and Cluster Computing; Performance