A high performance FFT library with single instruction multiple data (SIMD) architecture

Wang Xu, Zhang Yan, Ding Shunying
DOI: https://doi.org/10.1109/ICECC.2011.6066463
2011-01-01
Abstract: The Fast Fourier Transform (FFT) is a fundamental building block of digital signal processing (DSP). This paper presents a high-performance FFT library based on the radix-2 decimation-in-frequency (DIF) algorithm, which is well suited to SIMD architectures. Microprocessors with SIMD support, such as those from Intel and AMD, can perform floating-point operations in parallel on contiguous data in memory. A 128-point FFT based on the radix-2 DIF algorithm is implemented on the Intel architecture. All arithmetic operations in the FFT are optimized with SSE assembly, and the twiddle factors and the bit-reversal array are also laid out for SIMD access. The library is implemented in C and Intel Streaming SIMD Extensions (SSE) assembly instructions. A performance comparison with the Fastest Fourier Transform in the West (FFTW) library shows that the proposed FFT library is faster. © 2011 IEEE.