Phasor-Driven Acceleration for FFT-based CNNs

Eduardo Reis, Thangarajah Akilan, Mohammed Khalid
2024-06-01
Abstract: Recent research in deep learning (DL) has investigated the use of the Fast Fourier Transform (FFT) to accelerate the computations involved in Convolutional Neural Networks (CNNs) by replacing spatial convolution with element-wise multiplications in the spectral domain. These approaches mainly rely on the FFT to reduce the number of operations, which can be further decreased by adopting the Real-Valued FFT. In this paper, we propose using the phasor form, a polar representation of complex numbers, as a more efficient alternative to the traditional approach. The experimental results, evaluated on CIFAR-10, demonstrate that our method achieves speed improvements of up to a factor of 1.376 (average of 1.316) during training and up to 1.390 (average of 1.321) during inference when compared to the traditional rectangular form employed in modern CNN architectures. Similarly, when evaluated on CIFAR-100, our method achieves speed improvements of up to a factor of 1.375 (average of 1.299) during training and up to 1.387 (average of 1.300) during inference. Most importantly, given the modular aspect of our approach, the proposed method can be applied to any existing convolution-based DL model without design changes.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper aims to speed up FFT-based convolutional neural networks (CNNs). Specifically, the authors propose a method based on the phasor form that reduces the number of operations in frequency-domain convolution, thereby increasing the training and inference speeds of FFT-based CNNs.

### Problem Background

Traditional deep convolutional neural networks (DCNNs) require substantial computational resources and time when processing large-scale datasets. Although high-performance GPU clusters can significantly reduce training time, reducing the number of operations at the algorithm level remains an important research direction. Existing FFT-based CNN methods convert spatial convolution into element-wise multiplication in the frequency domain to reduce the number of operations. However, the complex multiplications in the frequency domain still require a large number of floating-point operations (FLOPs).

### Main Contributions of the Paper

1. **Introduction of Phasor Representation**: The paper proposes using phasors (i.e., the polar representation of complex numbers) in place of the traditional rectangular representation (real part and imaginary part). Each complex multiplication in the frequency domain then drops from 4 real multiplications and 2 real additions to 1 real multiplication and 1 real addition. That is 3/4 fewer real multiplications, and 2 real operations instead of 6.
2. **Experimental Verification**: The paper runs experiments on the CIFAR-10 and CIFAR-100 datasets to verify the effectiveness of the proposed method. Compared with the rectangular form, the method is up to 1.376 times faster during training and up to 1.390 times faster during inference (both maxima on CIFAR-10). Model accuracy remains unchanged.
3. **Universality**: The method is modular and can be applied to any existing convolution-based deep-learning model without changing the model design.

### Formula Summary

In both formulas, \( z_k = a_k + jb_k = |z_k|\angle\phi_k \) for \( k = 1, 2 \).

- Complex multiplication in rectangular representation: \[ z_1z_2=(a_1a_2 - b_1b_2)+j(a_1b_2 + a_2b_1) \]
- Complex multiplication in phasor representation: \[ z_1z_2 = |z_1|\cdot|z_2|\angle(\phi_1+\phi_2) \]

### Conclusion

By introducing the phasor representation, the paper reduces the number of operations in frequency-domain convolution for FFT-based CNNs, increasing training and inference speeds by a factor of about 1.3 on average. Future research can explore the phasor representation in other implementations, such as optimized CUDA kernels or embedded platforms.
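
The two multiplication formulas in the Formula Summary can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation. The helper names (`mul_rectangular`, `mul_phasor`, `to_phasor`, `from_phasor`) and the 8x8 toy input and kernel are assumptions made for illustration. It multiplies two spectra element-wise in both forms and confirms that the inverse FFT yields the same result.

```python
import numpy as np

def mul_rectangular(a1, b1, a2, b2):
    """(a1 + j*b1) * (a2 + j*b2): 4 real multiplications, 2 real additions."""
    real = a1 * a2 - b1 * b2
    imag = a1 * b2 + a2 * b1
    return real, imag

def mul_phasor(r1, phi1, r2, phi2):
    """Magnitudes multiply, phases add: 1 real multiplication, 1 real addition."""
    return r1 * r2, phi1 + phi2

def to_phasor(z):
    """Rectangular -> polar (magnitude, phase). Hypothetical helper."""
    return np.abs(z), np.angle(z)

def from_phasor(r, phi):
    """Polar -> rectangular. Hypothetical helper."""
    return r * np.exp(1j * phi)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))  # toy input (assumption)
k = rng.standard_normal((8, 8))  # toy kernel, assumed already padded to input size

# Move both operands to the spectral domain.
X = np.fft.fft2(x)
K = np.fft.fft2(k)

# Element-wise product in rectangular form.
re, im = mul_rectangular(X.real, X.imag, K.real, K.imag)
y_rect = np.fft.ifft2(re + 1j * im).real

# Element-wise product in phasor form.
r, phi = mul_phasor(*to_phasor(X), *to_phasor(K))
y_phasor = np.fft.ifft2(from_phasor(r, phi)).real

# Both paths should agree up to floating-point rounding.
max_abs_diff = float(np.max(np.abs(y_rect - y_phasor)))
print(max_abs_diff)
```

The sketch counts only the operations of the element-wise product itself. The conversions in `to_phasor` and `from_phasor` have their own cost, which this example does not measure.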