Abstract: Automatic modulation recognition-oriented Deep Neural Networks (ADNNs) have achieved higher recognition accuracy than traditional methods with less labor overhead. However, their high computational complexity usually far exceeds the computation capacity of communication devices built on Field Programmable Gate Array (FPGA) platforms. When resources are insufficient, an FPGA-based accelerator can still complete the full operation by dividing the calculation into several parts and computing them separately, but this causes unacceptable latency. Against this backdrop, we develop a new ADNN model, named VT-CNN2+, to improve recognition accuracy. Then, after stating the resource and latency problems of implementing VT-CNN2+ on the FPGA platform, we put forward an adaptive hardware accelerator. To implement the accelerator, Area Folding is introduced to optimize resource consumption. Moreover, Literacy Optimization, Parallelism Optimization, Inter-layer Cascading, Temporary Cache and Data Loading Optimization are adopted to reduce latency. Afterwards, the two components of our accelerator are detailed, i.e., the Once-designed module and the Re-designed module. Finally, to evaluate the performance and adaptivity of our accelerator, a series of experiments is conducted on two different FPGA platforms, i.e., AX7350 and ZedBoard. Results show that our accelerator successfully adapts to different FPGA platforms and remarkably reduces processing latency. Moreover, at 0.066249 s per data sample and with much lower energy consumption, our accelerator is one order of magnitude faster than desktop-level Central Processing Units (CPUs) and two orders of magnitude faster than embedded CPUs.

A Near Memory Computing FPGA Architecture for Neural Network Acceleration

An All-Digital Compute-In-Memory FPGA Architecture for Deep Learning Acceleration

A Survey of FPGA-based Neural Network Inference Accelerators

A Survey of FPGA-Based Neural Network Accelerator

An Energy-Efficient Near-Data Processing Accelerator for DNNs that Optimizes Data Accesses

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

Overcoming Data Transfer Bottlenecks in FPGA-based DNN Accelerators Via Layer Conscious Memory Management

DGNN-Booster: A Generic FPGA Accelerator Framework For Dynamic Graph Neural Network Inference

An Overview of FPGA Based Deep Learning Accelerators: Challenges and Opportunities

New paradigm of FPGA-based computational intelligence from surveying the implementation of DNN accelerators

Adaptive design and implementation of automatic modulation recognition accelerator

SECDA: Efficient Hardware/Software Co-Design of FPGA-based DNN Accelerators for Edge Inference

FP-DNN: an Automated Framework for Mapping Deep Neural Networks Onto FPGAs with RTL-HLS Hybrid Templates

HAO: Hardware-aware neural Architecture Optimization for Efficient Inference

A Convolutional Neural Network Accelerator Based on FPGA

A Low-Latency DNN Accelerator Enabled by DFT-Based Convolution Execution Within Crossbar Arrays

Towards Power Efficient DNN Accelerator Design on Reconfigurable Platform

A Low Power and Low Latency FPGA-Based Spiking Neural Network Accelerator

Heterogeneous Systems with Reconfigurable Neuromorphic Computing Accelerators

Invited: Algorithm-Software-Hardware Co-Design for Deep Learning Acceleration

An FPGA-Based Neural Network Overlay for ADAS Supporting Multi-Model and Multi-Mode