Abstract: With the rapid development of high-speed wireless communications, the 60 GHz millimeter-wave (mm-wave) frequency range and radio-over-fiber (RoF) systems have been investigated as a promising solution for delivering mm-wave signals. Neural networks have been studied as a way to improve mm-wave RoF system performance at the receiver side by suppressing linear and nonlinear impairments. However, previous neural network studies in mm-wave RoF systems focus on off-line implementation with high-end GPUs, which is not practical for low-power, low-cost, computation-limited platforms. To solve this issue, we investigate neural network hardware accelerator implementations using the field-programmable gate array (FPGA), taking advantage of its low power consumption, parallel computation capability, and reconfigurability. Convolutional neural network (CNN) and binary convolutional neural network (BCNN) hardware accelerators are demonstrated. In addition, to satisfy the low-latency requirement of mm-wave RoF systems and to enable the use of low-cost compact FPGA devices, a novel inner parallel optimization method is proposed. Compared with execution on an embedded processor (ARM Cortex-A9), the CNN/BCNN FPGA-based hardware accelerators reduce latency by over 92%. Compared with non-optimized FPGA implementations, the proposed optimization method reduces the processing latency by over 44% for both CNN and BCNN. Compared with the GPU implementation, the CNN implementation with the proposed optimization method reduces latency by 85.49% and power consumption by 86.91%. Although the latency of the BCNN implementation with the proposed optimization method is higher than that of the GPU implementation, its power consumption is reduced by 86.14%. The FPGA-based neural network hardware accelerators thus provide a promising solution for mm-wave RoF systems.

Accelerating RNNs on FPGA with HBM

High-performance Reconfigurable DNN Accelerator on a Bandwidth-limited Embedded System

FPGA-based Accelerator for Long Short-Term Memory Recurrent Neural Networks

FPGA Acceleration of Recurrent Neural Network Based Language Model

Implementation and Optimization of the Accelerator Based on FPGA Hardware for LSTM Network

The implementation of a Deep Recurrent Neural Network Language Model on a Xilinx FPGA

FiC-RNN: A Multi-FPGA Acceleration Framework for Deep Recurrent Neural Networks

High-Performance FPGA-Based CNN Accelerator with Block-Floating-Point Arithmetic

FPGA-based Neural Network Accelerator for Millimeter-Wave Radio-over-Fiber Systems

Recurrent Neural Networks Hardware Implementation on FPGA

H2PIPE: High throughput CNN Inference on FPGAs with High-Bandwidth Memory

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

A High-Performance Accelerator for Large-Scale Convolutional Neural Networks

A Block-Floating-Point Arithmetic Based FPGA Accelerator for Convolutional Neural Networks

FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Adaptive design and implementation of automatic modulation recognition accelerator

FP-BNN: Binarized neural network on FPGA

A high-throughput scalable BNN accelerator with fully pipelined architecture

A High Performance Reconfigurable Hardware Architecture for Lightweight Convolutional Neural Network

A Power-Efficient Accelerator Based on FPGAs for LSTM Network

A High Performance FPGA-based Accelerator for Large-Scale Convolutional Neural Networks