An Efficient Implementation of FFT Based on CGRA

Wei Jinhe, Yang Jinjiang, Li Hui, Wu Youyu
DOI: https://doi.org/10.1109/iccsnt.2017.8343746
2017-01-01
Abstract: This paper presents an efficient implementation of the complex FFT algorithm on REMUS-II_MB, a CGRA-based reconfigurable architecture. The implementation is divided into two steps. In the first step, the local sequential stages are performed independently on the RCAs; in the second step, the cross parallel stages, which require data communication, are processed. Performance is improved by two techniques, namely pipeline bubble elimination and data block location rearrangement. Compared with other parallel FFT implementations, the proposed implementation on REMUS-II_MB achieves a performance advantage of 1.15 to 12.6 times with little local memory cost.
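The two-step split described in the abstract can be illustrated with a minimal software sketch. It is not the paper's REMUS-II_MB mapping: it assumes a radix-2 decimation-in-time FFT on bit-reversed input, with the data cut into equal contiguous blocks, each block standing in for the local memory of one RCA. Under that assumption, the early stages only touch data inside one block (the local step), while the remaining stages pair elements from different blocks (the cross step). The names `two_step_fft`, `butterfly_stage`, and `num_blocks` are hypothetical, and the pipeline bubble elimination and data block rearrangement techniques are not modeled.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def butterfly_stage(data, span):
    """Apply one in-place radix-2 DIT stage with butterfly distance `span`."""
    for start in range(0, len(data), 2 * span):
        for k in range(span):
            w = cmath.exp(-2j * cmath.pi * k / (2 * span))  # twiddle factor
            a = data[start + k]
            b = data[start + k + span] * w
            data[start + k] = a + b
            data[start + k + span] = a - b


def two_step_fft(x, num_blocks):
    """FFT of x (length a power of two) split into local and cross stages.

    Assumption: num_blocks is a power of two no larger than len(x); each
    block models the data held by one processing array.
    """
    n = len(x)
    bits = n.bit_length() - 1
    block = n // num_blocks

    # Bit-reversed input order, as required by in-place DIT.
    data = [0j] * n
    for i, v in enumerate(x):
        data[bit_reverse(i, bits)] = complex(v)

    # Step 1: local stages. Each block is processed independently,
    # because every butterfly with span < block stays inside one block.
    blocks = [data[i:i + block] for i in range(0, n, block)]
    for blk in blocks:
        span = 1
        while span < block:
            butterfly_stage(blk, span)
            span *= 2
    data = [v for blk in blocks for v in blk]

    # Step 2: cross stages. Butterflies with span >= block pair elements
    # from different blocks, so these stages need data exchange.
    span = block
    while span < n:
        butterfly_stage(data, span)
        span *= 2
    return data
```

For example, `two_step_fft(samples, 4)` on a 16-point input runs two local stages inside each of four 4-point blocks, followed by two cross stages over the full array.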