Vector Memory-Access Shuffle Fused Instructions for FFT-Like Algorithms

Liu Sheng, Yuan Bo, Guo Yang, Sun Haiyan, Jiang Zekun
DOI: https://doi.org/10.23919/cje.2021.00.401
IF: 1.019
2023-01-01
Chinese Journal of Electronics
Abstract: Shuffle operations are the bottleneck when mapping FFT-like algorithms onto vector single instruction multiple data (SIMD) architectures. We propose six (three pairs of) vector memory-access shuffle fused instructions, whose correctness is proved mathematically. Combined with the proposed modified binary-exchange method, these instructions efficiently address the bottleneck for decimation-in-frequency and decimation-in-time (DIF/DIT) radix-2/4 FFT-like algorithms, improving performance by 17.9%-111.2% and reducing code size by 5.4%-39.8%. In addition, the proposed instructions fit some hybrid-radix FFTs and are suitable for initial or result data placement in general algorithms. The software and hardware costs of the proposed instructions are moderate.
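To illustrate why data reordering appears at all in FFT-like algorithms, the following is a minimal scalar sketch of a textbook radix-2 DIF FFT. It is not the paper's instructions or its modified binary-exchange method; the function names `bit_reverse` and `fft_dif_radix2` are hypothetical and chosen only for this example. Each stage pairs elements `half` positions apart, and the results come out in bit-reversed order, so on a SIMD machine the shrinking stride and the final reordering are the kind of steps that typically require shuffles.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft_dif_radix2(x):
    """Textbook radix-2 DIF FFT (illustrative sketch, power-of-two length)."""
    a = list(x)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"

    half = n // 2
    while half >= 1:
        # Butterflies pair elements that are `half` apart. Once `half`
        # drops below the vector width, both operands of a butterfly sit
        # in the same vector register, which is where shuffles come in.
        for start in range(0, n, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / (2 * half))
                u, v = a[start + k], a[start + k + half]
                a[start + k] = u + v
                a[start + k + half] = (u - v) * w
        half //= 2

    # After the last stage, a[j] holds X[bit_reverse(j)]; undo that
    # permutation so the result is returned in natural order.
    bits = n.bit_length() - 1
    return [a[bit_reverse(i, bits)] for i in range(n)]
```

The sketch performs the reordering as a separate final pass for clarity; the abstract's approach, by contrast, fuses shuffling with vector memory access.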