Improving CNN Performance on FPGA Clusters Through Topology Exploration

Ruihao Li, Ke Liu, Xiaojun Cai, Mengying Zhao, Lizy K. John, Zhiping Jia
DOI: https://doi.org/10.1145/3412841.3441893
2021-01-01
Abstract: Field-Programmable Gate Array (FPGA) platforms have become a popular choice for deploying Convolutional Neural Networks (CNNs) because of their high parallelism and low energy consumption. Because a single FPGA has limited on-chip computation and storage resources, FPGA clusters are promising candidates for improving CNN throughput. In this paper, we first put forward strategies to optimize inter-board resource allocation in FPGA clusters. We then model the multi-board cluster problem with dynamic programming to obtain the optimal topology of the FPGA cluster. Experimental results show that, with our proposed FPGA cluster topology, typical well-known CNNs achieve an average throughput 4.33x that of single-board solutions and 1.87x that of other state-of-the-art multi-board solutions.
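The abstract does not spell out the dynamic-programming formulation, so the following is only a minimal sketch of one way such a model could look, not the authors' method. It assumes a pipelined cluster in which each board runs a contiguous group of layers, all boards are identical, inter-board communication cost is ignored, and per-layer latencies are known. Under those assumptions, throughput is limited by the slowest stage, so the sketch minimizes that bottleneck. The function name `partition_layers` and the latency values are hypothetical.

```python
def partition_layers(layer_costs, num_boards):
    """Split layers into at most `num_boards` contiguous pipeline stages,
    minimizing the latency of the slowest stage (the throughput bottleneck).

    Assumptions (not from the paper): identical boards, known per-layer
    latencies, and zero inter-board communication cost.
    Returns (bottleneck_latency, stages), where stages lists layer indices.
    """
    n = len(layer_costs)

    # prefix[i] is the total latency of the first i layers.
    prefix = [0]
    for cost in layer_costs:
        prefix.append(prefix[-1] + cost)

    inf = float("inf")
    # best[k][i]: smallest bottleneck when the first i layers use exactly k stages.
    best = [[inf] * (n + 1) for _ in range(num_boards + 1)]
    # cut[k][i]: index where the last of those k stages begins.
    cut = [[0] * (n + 1) for _ in range(num_boards + 1)]
    best[0][0] = 0

    for k in range(1, num_boards + 1):
        for i in range(1, n + 1):
            for j in range(k - 1, i):
                # The last stage covers layers j .. i-1.
                last_stage = prefix[i] - prefix[j]
                candidate = max(best[k - 1][j], last_stage)
                if candidate < best[k][i]:
                    best[k][i] = candidate
                    cut[k][i] = j

    # Using fewer boards is allowed, so pick the best stage count.
    k_best = min(range(1, num_boards + 1), key=lambda k: best[k][n])

    # Walk the recorded cut points backwards to recover the stages.
    stages = []
    i, k = n, k_best
    while k > 0:
        j = cut[k][i]
        stages.append(list(range(j, i)))
        i, k = j, k - 1
    stages.reverse()
    return best[k_best][n], stages


if __name__ == "__main__":
    # Hypothetical per-layer latencies (arbitrary units) for a 6-layer CNN.
    latencies = [4, 2, 3, 7, 1, 5]
    bottleneck, stages = partition_layers(latencies, num_boards=3)
    print(bottleneck, stages)
```

A realistic model would also have to account for per-board resource limits and the cost of moving feature maps between boards, which this sketch deliberately leaves out.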