Ethernet based multi-FPGA matrix multiplication parallel computing system design

Tian Xiang, Zhou Fan, Chen Yaowu, Liu Li, Chen Yao
DOI: https://doi.org/10.3321/j.issn:0254-3087.2007.08.005
2007-01-01
Abstract: Many application areas, such as process control and image processing, make extensive use of matrix multiplication, and the performance of this operation is critical to the whole system. This paper presents the design of an Ethernet-based parallel computing system for double-precision floating-point matrix multiplication. The design was implemented on Xilinx's XUP Virtex-Ⅱ Pro development system. The host is in charge of partitioning the task and transferring data to the FPGA computation units. During computation, the host broadcasts matrix elements to all the computation units that need the same data, which reduces the communication overhead of the whole system. The matrix multiplier in each FPGA computation unit is optimized for sparse matrix multiplication: it contains a pre-processing module that skips the computation of all-zero element blocks and thus improves system performance. Theoretical analysis and experimental results show that the system achieves high performance.
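The zero-block skipping described in the abstract can be illustrated in software. The following is a minimal Python sketch, not the paper's FPGA design: it assumes square matrices stored as nested lists, a block size that divides the matrix dimension, and a pre-processing pass that flags all-zero blocks before any multiplication is done. The names `is_zero_block` and `blocked_matmul_skip_zero` are hypothetical and chosen only for this illustration.

```python
def is_zero_block(m, r0, c0, bs):
    """Return True if the bs x bs block of m starting at (r0, c0) is all zeros."""
    return all(m[r][c] == 0.0
               for r in range(r0, r0 + bs)
               for c in range(c0, c0 + bs))


def blocked_matmul_skip_zero(a, b, bs):
    """Multiply square matrices a and b block by block, skipping zero blocks.

    Returns (product, number_of_block_products_skipped).
    Assumption: len(a) == len(b) and bs divides the dimension.
    """
    n = len(a)
    if n % bs != 0:
        raise ValueError("block size must divide the matrix dimension")
    nb = n // bs

    # Pre-processing pass: flag every all-zero block once, up front.
    a_zero = [[is_zero_block(a, i * bs, k * bs, bs) for k in range(nb)]
              for i in range(nb)]
    b_zero = [[is_zero_block(b, k * bs, j * bs, bs) for j in range(nb)]
              for k in range(nb)]

    c = [[0.0] * n for _ in range(n)]
    skipped = 0
    for i in range(nb):
        for j in range(nb):
            for k in range(nb):
                # A zero block on either side contributes nothing to C[i][j].
                if a_zero[i][k] or b_zero[k][j]:
                    skipped += 1
                    continue
                for r in range(i * bs, (i + 1) * bs):
                    for t in range(k * bs, (k + 1) * bs):
                        a_rt = a[r][t]
                        for s in range(j * bs, (j + 1) * bs):
                            c[r][s] += a_rt * b[t][s]
    return c, skipped


if __name__ == "__main__":
    # Only the top-left 2x2 block of `a` is non-zero; `b` is the identity.
    a = [[1.0, 2.0, 0.0, 0.0],
         [3.0, 4.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 0.0]]
    b = [[1.0 if r == col else 0.0 for col in range(4)] for r in range(4)]
    product, skipped = blocked_matmul_skip_zero(a, b, 2)
    print(product == a, skipped)
```

In this sketch the saving grows with the fraction of all-zero blocks, since each flagged block removes a full block product from the inner loops. How the actual hardware pre-processing module detects and skips zero blocks is not specified in the abstract.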