Efficient and Flexible Implementation of FFT Application for CGRA Processor

Yashuang Yi, Jiangyuan Gu, Lie Luo, Zhi Wang, Boxiao Han, Hongjun He, Shouyi Yin
DOI: https://doi.org/10.1109/icsp58490.2023.10248572
2023-01-01
Abstract: Motivated by the explosive growth of data, this paper uses a coarse-grained reconfigurable architecture (CGRA) for an energy-efficient implementation of the FFT, a data-intensive and computation-intensive task. The butterfly operation of the FFT is converted into a data flow graph (DFG), which is mapped onto the processing element array (PEA) by optimizing memory access and using ping-pong mode. Two sets of operators are computed in parallel, which reduces the computing time of a single layer by more than half. The method is extensible because it takes advantage of the software/hardware reconfigurability of the CGRA: by simply modifying the configuration, it can support FFTs of different sizes, such as 256 and 512 points. Experiments on FFT-256 and FFT-512 verify its feasibility and efficiency. The results indicate that a CGRA can implement the FFT with better flexibility and higher energy efficiency.
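As a point of reference for the butterfly operation and the layer structure mentioned in the abstract, the following is a minimal software sketch of an iterative FFT. It assumes a radix-2 decimation-in-time formulation, which the abstract does not specify, and it does not model the DFG mapping, the PEA, the memory-access optimization, or the ping-pong mode. The names `butterfly`, `bit_reverse`, and `fft_radix2` are hypothetical and chosen only for illustration.

```python
import cmath


def butterfly(a, b, w):
    """One radix-2 butterfly: combine inputs a and b with twiddle factor w."""
    t = w * b
    return a + t, a - t


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_radix2(x):
    """Iterative radix-2 decimation-in-time FFT (assumed formulation)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1

    # Reorder the input so that each layer can be computed in place.
    data = [0j] * n
    for i, v in enumerate(x):
        data[bit_reverse(i, bits)] = complex(v)

    half = 1
    while half < n:  # one pass per layer, log2(n) layers in total
        for start in range(0, n, 2 * half):
            for k in range(half):
                # Twiddle factor for position k within a group of size 2 * half.
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                i, j = start + k, start + k + half
                data[i], data[j] = butterfly(data[i], data[j], w)
        half *= 2
    return data
```

In this sketch the transform size is simply the length of the input list, so `fft_radix2(samples)` works for 256 or 512 samples without code changes. All butterflies within one layer are independent of each other, which is the property that makes computing several of them in parallel possible.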