Efficient FFT Implementation on a CGRA

Li Yu, Yang Jinjiang, Liu Leibo
DOI: https://doi.org/10.1145/3395260.3395279
2020-01-01
Abstract: In this paper, we present an efficient implementation of the FFT algorithm on a CGRA-based reconfigurable architecture. The radix-4 method is used because it suits the strengths of the proposed CGRA. The performance of the radix-4 FFT implementation is optimized by exploiting the parallelism of the Processing Elements (PEs) and the multi-access scheme of the shared memory (SM). Compared with other similar reconfigurable architectures, the proposed FFT implementation on the CGRA has performance advantages: for a 1024-point FFT, for example, it achieves a 1.93X to 6.22X advantage.
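For readers unfamiliar with the radix-4 method named in the abstract, the following is a minimal software sketch of a recursive radix-4 decimation-in-time FFT. It is not the paper's CGRA mapping; the function names `fft_radix4` and `dft_naive` are hypothetical, and the input length is assumed to be a power of 4 (as 1024 is). Each butterfly produces four outputs from the same four inputs, which gives a rough idea of the kind of independent work that parallel PEs could share.

```python
import cmath


def fft_radix4(x):
    """Recursive radix-4 decimation-in-time FFT (illustrative sketch).

    Assumes len(x) is a power of 4; raises ValueError otherwise.
    """
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")

    # Split into four interleaved sub-sequences and transform each.
    subs = [fft_radix4(x[r::4]) for r in range(4)]
    q = n // 4
    out = [0j] * n

    for k in range(q):
        w1 = cmath.exp(-2j * cmath.pi * k / n)  # twiddle factor W_N^k
        a = subs[0][k]
        b = w1 * subs[1][k]
        c = w1 * w1 * subs[2][k]
        d = w1 * w1 * w1 * subs[3][k]

        # Radix-4 butterfly: the only extra multipliers are +/-1 and +/-j.
        out[k] = a + b + c + d
        out[k + q] = a - 1j * b - c + 1j * d
        out[k + 2 * q] = a - b + c - d
        out[k + 3 * q] = a + 1j * b - c - 1j * d

    return out


def dft_naive(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]
```

A quick check is to compare `fft_radix4` against `dft_naive` on a short input such as `list(range(16))`; the two results should agree to within floating-point rounding.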