Flexible Acceleration Framework for Dense/Sparse Matrix Multiplication on Versal ACAP

Youwei Xiao, Yun Liang
DOI: https://doi.org/10.1109/ISEDA59274.2023.10218584
2023-01-01
Abstract: Matrix multiplication (MM) is one of the most commonly applied operations across application domains such as deep learning, recommendation systems, and robotics. AMD Xilinx Versal ACAP combines optimized ASIC engines (AIE) and programmable logic (FPGA) with hardened DSPs, making it an ideal platform for diverse MM-intensive applications thanks to its strong computing power and sufficient reconfigurability. However, this architectural heterogeneity also brings significant programming challenges and creates barriers for non-expert users who want to accelerate their target applications. The situation worsens when the acceleration system is composed of various accelerators, such as dense or sparse MM accelerators under different configurations. In this paper, we propose FlexVA, a flexible acceleration framework that targets the Versal ACAP architecture and supports general operations, including both dense and sparse matrix multiplication. FlexVA provides convenient description tools for programming accelerator candidates as parameterized accelerator templates under flexible configurations. The framework not only allows users to select, configure, and compose accelerators flexibly but also provides an XRT-based software runtime with user-configurable scheduling strategies. We implement and optimize both dense-dense (GEMM) and sparse-dense (SpMM) accelerators with FlexVA.
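To make the two operations named in the abstract concrete, the following is a minimal reference sketch in plain Python of a dense-dense product (GEMM) and a sparse-dense product (SpMM). It is not FlexVA's implementation and does not touch the Versal hardware or XRT. The CSR (compressed sparse row) storage format and the function names `gemm`, `dense_to_csr`, and `spmm_csr` are assumptions chosen for illustration only.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def gemm(a: Matrix, b: Matrix) -> Matrix:
    """Dense-dense product C = A @ B; visits every entry of A."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = a[i][p]
            for j in range(n):
                c[i][j] += a_ip * b[p][j]
    return c


def dense_to_csr(a: Matrix) -> Tuple[List[float], List[int], List[int]]:
    """Convert a dense matrix to CSR arrays (values, col_idx, row_ptr)."""
    values: List[float] = []
    col_idx: List[int] = []
    row_ptr: List[int] = [0]
    for row in a:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        # row_ptr[i + 1] marks where row i's nonzeros end in `values`.
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def spmm_csr(values: List[float], col_idx: List[int],
             row_ptr: List[int], b: Matrix) -> Matrix:
    """Sparse-dense product C = A @ B; visits only the nonzeros of A."""
    m, n = len(row_ptr) - 1, len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for idx in range(row_ptr[i], row_ptr[i + 1]):
            v, p = values[idx], col_idx[idx]
            for j in range(n):
                c[i][j] += v * b[p][j]
    return c


if __name__ == "__main__":
    a = [[1.0, 0.0, 2.0],
         [0.0, 0.0, 3.0]]
    b = [[1.0, 2.0],
         [3.0, 4.0],
         [5.0, 6.0]]
    # Both paths compute the same product; SpMM skips the zeros of A.
    print(gemm(a, b))
    print(spmm_csr(*dense_to_csr(a), b))
```

The two routines produce the same result but have different loop structures and memory access patterns: the dense kernel has regular, predictable accesses, while the sparse kernel follows data-dependent indices. That difference is one reason a system may want separate, individually configurable accelerators for each case.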