Revisiting Convolution and FFT on Parallel Computation Platforms

Haohuan Fu, Robert G. Clapp, Olav Lindtjorn
DOI: https://doi.org/10.1190/1.3513484
2010-01-01
Abstract: Because FFTs reduce computational complexity, they are commonly used to speed up convolution in seismic computations. However, on current parallel computation platforms, such as multi-core processors, Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs), performance is determined not only by computational complexity but also by other factors, such as the parallelism and the memory access pattern of the algorithm. In our work, we investigate optimized designs of convolutions and FFTs on different parallel computation platforms with different problem and stencil sizes. Experimental results show that, for many stencil sizes used in practical seismic applications, the direct convolution approach performs better than the FFT-based approach on parallel platforms. For 1D cases, the parallel performance of the FFT-based approach is limited by the data dependency of the FFT algorithm. 3D FFT-based approaches use cache poorly. Only 2D FFTs scale well with the parallel computation capacity of modern architectures. Technological trends indicate that these findings will continue to hold.
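To make the two approaches compared in the abstract concrete, the following is a minimal serial sketch in Python, not the authors' implementation. The function names (`direct_convolve`, `fft`, `fft_convolve`) are hypothetical, the FFT is a plain recursive radix-2 Cooley-Tukey transform chosen for brevity, and no parallelism or platform-specific optimization is shown. It only illustrates that both routes compute the same 1D convolution.

```python
import cmath


def direct_convolve(signal, stencil):
    """Full linear convolution by direct summation: O(N * K) operations.

    Each output sample depends only on the inputs, so the outputs can be
    computed independently of one another.
    """
    n, k = len(signal), len(stencil)
    out = [0.0] * (n + k - 1)
    for i in range(n):
        for j in range(k):
            out[i + j] += signal[i] * stencil[j]
    return out


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is left unnormalized (the caller divides by n).
    Each level combines the results of the level below it, which is the
    stage-to-stage data dependency of the algorithm.
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for m in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * m / n) * odd[m]
        out[m] = even[m] + twiddle
        out[m + n // 2] = even[m] - twiddle
    return out


def fft_convolve(signal, stencil):
    """Linear convolution via zero-padded FFTs: O(M log M), M = padded length."""
    size = len(signal) + len(stencil) - 1
    padded = 1
    while padded < size:  # round up to a power of two for the radix-2 FFT
        padded *= 2
    a = fft(list(signal) + [0.0] * (padded - len(signal)))
    b = fft(list(stencil) + [0.0] * (padded - len(stencil)))
    product = [x * y for x, y in zip(a, b)]  # pointwise product in frequency domain
    result = fft(product, inverse=True)
    return [v.real / padded for v in result[:size]]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    stencil = [1.0, 0.0, -1.0]
    print(direct_convolve(signal, stencil))
    print(fft_convolve(signal, stencil))
```

For a short stencil of length K, the direct route costs roughly N * K multiply-adds, while the FFT route costs three transforms of the padded length regardless of K; that is why the stencil size matters in the comparison.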