An Efficient FPGA-based Accelerator Design for Convolution

Peng-Fei Song, Jeng-Shyang Pan, Chun-Sheng Yang, Chiou-Yng Lee
DOI: https://doi.org/10.1109/icawst.2017.8256507
2017-01-01
Abstract: The number theoretic transform (NTT), built on modular arithmetic operations, can perform convolution efficiently in a ring without round-off errors. This paper proposes a new, efficient architecture for the transform that supports various operand sizes. To balance the trade-off between area and latency, it uses a variant of the constant geometry architecture in which the forward and backward sub-stages share the same computation pattern. In addition, an XOR-based multi-ported RAM is adopted to accelerate memory access by allowing multiple simultaneous reads and writes. As a result, the developed accelerator achieves a lower area-latency cost on FPGA than other designs.
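To illustrate how a number theoretic transform (NTT) yields exact convolution, the following is a minimal software sketch and not the paper's hardware design. It uses a standard in-place radix-2 Cooley-Tukey NTT rather than the constant geometry variant described in the abstract. The modulus `P = 998244353`, the primitive root `G = 3`, and all function names are assumptions chosen for illustration.

```python
# Illustrative parameters (assumptions, not taken from the paper):
P = 998244353  # prime with P - 1 = 119 * 2**23, so power-of-two lengths up to 2**23 work
G = 3          # a primitive root modulo P


def ntt(values, invert=False):
    """Radix-2 Cooley-Tukey NTT over Z_P; len(values) must be a power of two."""
    a = list(values)
    n = len(a)

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    length = 2
    while length <= n:
        # Primitive length-th root of unity (its inverse for the backward transform).
        w_len = pow(G, (P - 1) // length, P)
        if invert:
            w_len = pow(w_len, P - 2, P)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(start, start + half):
                u = a[k]
                v = a[k + half] * w % P
                a[k] = (u + v) % P          # butterfly: sum
                a[k + half] = (u - v) % P   # butterfly: difference
                w = w * w_len % P
        length <<= 1

    if invert:
        n_inv = pow(n, P - 2, P)  # modular inverse of n via Fermat's little theorem
        a = [x * n_inv % P for x in a]
    return a


def cyclic_convolution(x, y):
    """Cyclic convolution modulo P: forward NTT, pointwise product, inverse NTT."""
    n = len(x)
    if n != len(y) or n == 0 or n & (n - 1):
        raise ValueError("inputs must share a power-of-two length")
    fx, fy = ntt(x), ntt(y)
    return ntt([p * q % P for p, q in zip(fx, fy)], invert=True)


def naive_cyclic_convolution(x, y):
    """Direct O(n^2) reference used only to check the NTT result."""
    n = len(x)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] = (out[(i + j) % n] + x[i] * y[j]) % P
    return out


if __name__ == "__main__":
    x, y = [1, 2, 3, 4], [5, 6, 7, 8]
    print(cyclic_convolution(x, y))
    print(naive_cyclic_convolution(x, y))
```

Because every operation is integer arithmetic modulo `P`, the output matches the direct computation exactly as long as the true convolution values stay below `P`, which is the round-off-free property the abstract refers to.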