A Hardware Accelerator for Standard Convolution and Depthwise Convolution

Fubang An, Wei Cao, Xuegong Zhou, Lingli Wang
DOI: https://doi.org/10.1109/cstic58779.2023.10219203
2023-01-01
Abstract: This paper proposes a CNN hardware accelerator that supports both standard convolution and depthwise convolution through two data flow modes. A computation array built from DSPs efficiently supports parallelism across input channels and output channels for standard convolution. A Membank architecture makes the computation array more efficient for depthwise convolution, where the parallelism is across the input feature map and the kernel. The accelerator runs EfficientNet on a Xilinx Alveo U280 with a system clock of 300 MHz and a DSP clock of 600 MHz. The results show 1.14× the throughput and 1.12× the throughput per DSP of the latest FPGA-based accelerator on the same data center accelerator card.
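The difference between the two layer types can be made concrete with a plain software reference. The sketch below is not the accelerator's data flow or the Membank design; it is a minimal pure-Python illustration, assuming stride 1 and no padding, with hypothetical function names `standard_conv` and `depthwise_conv`. It shows that standard convolution sums over input channels for each output channel, while depthwise convolution has no cross-channel sum, so the only work inside a channel is over feature-map positions and kernel taps.

```python
def standard_conv(x, w):
    """Standard convolution (stride 1, no padding; illustrative assumption).

    x: input,  indexed [c_in][row][col]
    w: kernel, indexed [c_out][c_in][k_row][k_col]
    Returns output indexed [c_out][row][col].
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    c_out, k = len(w), len(w[0][0])
    oh, ow = h - k + 1, wd - k + 1
    out = [[[0] * ow for _ in range(oh)] for _ in range(c_out)]
    for co in range(c_out):        # output channels are independent
        for ci in range(c_in):     # input channels are summed (a reduction)
            for i in range(oh):
                for j in range(ow):
                    for di in range(k):
                        for dj in range(k):
                            out[co][i][j] += x[ci][i + di][j + dj] * w[co][ci][di][dj]
    return out


def depthwise_conv(x, w):
    """Depthwise convolution (stride 1, no padding; illustrative assumption).

    x: input,  indexed [c][row][col]
    w: kernel, indexed [c][k_row][k_col] (one filter per channel)
    Returns output indexed [c][row][col].
    """
    c, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0])
    oh, ow = h - k + 1, wd - k + 1
    out = [[[0] * ow for _ in range(oh)] for _ in range(c)]
    for ch in range(c):            # each channel uses only its own filter
        for i in range(oh):        # no sum across channels, so the remaining
            for j in range(ow):    # loops are over positions and kernel taps
                for di in range(k):
                    for dj in range(k):
                        out[ch][i][j] += x[ch][i + di][j + dj] * w[ch][di][dj]
    return out


if __name__ == "__main__":
    x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
         [[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
    w_dw = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
    print(depthwise_conv(x, w_dw))
    print(standard_conv(x, [w_dw]))  # one output channel reusing the same taps
```

In the depthwise loop nest there is no input-channel reduction to spread across an array, which is consistent with the abstract's choice of a different parallel strategy for that layer type.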