Accelerating Convolutional Processing by Harnessing Channel Shifts in Arrayed Waveguide Gratings

Dan Yi, Caiyue Zhao, Zunyue Zhang, Hongnan Xu, Hon Ki Tsang
DOI: https://doi.org/10.1002/lpor.202400435
2024-08-27
LASER & PHOTONICS REVIEWS
Abstract: A novel convolutional processor is proposed using the shifted spectral response of a pair of arrayed waveguide gratings (AWGs) to mimic the kernel shifts during image convolution. This inherent mixing of inputs in the AWG's spectral response eliminates the need for repetitive element‐wise computations while enabling the simultaneous generation of convolved output maps. Convolutional neural networks are a powerful category of artificial neural networks that extract features from raw data, greatly reducing parametric complexity and improving pattern recognition and prediction accuracy. Optical neural networks offer the promise of dramatically accelerating computing speed while maintaining low power consumption, even with high‐speed data streams running at hundreds of gigabits per second. Here, we propose an optical convolutional processor (CP) that leverages the spectral response of an AWG to enhance convolution speed by eliminating the need for repetitive element‐wise multiplication. Our design features a balanced AWG configuration, enabling both the positive and negative weightings essential for convolutional kernels. A proof‐of‐concept demonstration of an 8‐bit resolution processor is experimentally implemented using a pair of AWGs with a broadband Mach–Zehnder interferometer (MZI) designed to achieve uniform weighting across the whole spectrum. Experimental results demonstrate the CP's effectiveness in edge detection, and the processor achieved 96% accuracy in a convolutional neural network for MNIST recognition. This approach can be extended to other common operations, such as pooling and deconvolution in generative adversarial networks. It is also scalable to more complex networks, making it suitable for applications like autonomous vehicles and real‐time video recognition.
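The idea of replacing per-window element-wise multiplication with a sum of shifted, weighted copies of the input, and of obtaining signed kernel weights from the difference of two non-negative branches, can be illustrated numerically. The sketch below is only a software analogy under stated assumptions: it does not model the AWG, the MZI, or any optical effect, and the names `shift_and_add_conv`, `w_pos`, and `w_neg` are hypothetical, not taken from the paper.

```python
import numpy as np

def shift_and_add_conv(signal, kernel):
    """Valid-mode 1D sliding-window product built from shifted input copies.

    Assumption: each kernel tap sees the input shifted by one more sample,
    loosely analogous to the channel shifts described in the abstract.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_out = signal.size - kernel.size + 1

    # Split the signed kernel into two non-negative parts; their difference
    # plays the role of a balanced (positive minus negative) readout.
    w_pos = np.clip(kernel, 0, None)
    w_neg = np.clip(-kernel, 0, None)

    pos = np.zeros(n_out)
    neg = np.zeros(n_out)
    for k in range(kernel.size):
        shifted = signal[k:k + n_out]  # input shifted by k samples
        pos += w_pos[k] * shifted      # every output accumulates at once
        neg += w_neg[k] * shifted
    return pos - neg

# A 1D "edge detection" example: a bright bar on a dark background.
row = np.array([0, 0, 1, 1, 1, 0, 0], dtype=float)
edge_kernel = np.array([-1.0, 0.0, 1.0])
out = shift_and_add_conv(row, edge_kernel)
print(out)
```

The loop runs once per kernel tap rather than once per output position, so all outputs are accumulated together. The result matches `np.correlate(row, edge_kernel, mode="valid")`, the sliding-window operation that CNN layers conventionally call convolution.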