Design of a Convolutional Neural Network Accelerator Based on On-Chip Data Reordering

Yang Liu, Yiheng Zhang, Xiaoran Hao, Lan Chen, Mao Ni, Ming Chen, Rong Chen

DOI: https://doi.org/10.3390/electronics13050975

IF: 2.9

2024-03-05

Electronics

Abstract: Convolutional neural networks have been widely applied in the field of computer vision. In convolutional neural networks, convolution operations account for more than 90% of the total computational workload. The current mainstream approach to achieving highly energy-efficient convolution operations is through dedicated hardware accelerators. Convolution operations involve a significant amount of weights and input feature data. Due to limited on-chip cache space in accelerators, there is a significant amount of off-chip DRAM memory access involved in the computation process. The latency of DRAM access is 20 times higher than that of SRAM, and the energy consumption of DRAM access is 100 times higher than that of multiply–accumulate (MAC) units. It is evident that the "memory wall" and "power wall" issues in neural network computation remain challenging. This paper presents the design of a hardware accelerator for convolutional neural networks. It employs a dataflow optimization strategy based on on-chip data reordering. This strategy improves on-chip data utilization and reduces the frequency of data exchanges between on-chip cache and off-chip DRAM. The experimental results indicate that, compared to the accelerator without this strategy, it can reduce data exchange frequency by up to 82.9%.

engineering, electrical & electronic; computer science, information systems; physics, applied

What problem does this paper attempt to address?

The paper addresses the "memory wall" and "power wall" problems that convolutional neural networks (CNNs) face during computation. Convolution operations account for more than 90% of a CNN's total computational workload, and they involve large amounts of weight data and input feature data. Because on-chip cache space in an accelerator is limited, data must be read frequently from off-chip DRAM. According to the abstract, DRAM access latency is 20 times higher than that of SRAM, and DRAM access energy is 100 times higher than that of a multiply-accumulate (MAC) unit, so these accesses both slow computation and significantly increase power consumption. To address this, the paper proposes a hardware accelerator design with a dataflow optimization strategy based on on-chip data reordering. The strategy improves on-chip data utilization and reduces the frequency of data exchanges between the on-chip cache and off-chip DRAM. The experimental results show that, compared with the accelerator without this strategy, data exchange frequency is reduced by up to 82.9%. In short, the paper aims to improve the energy efficiency and computational efficiency of CNN accelerators by improving data reuse, thereby reducing dependence on off-chip memory.
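The link between on-chip data reuse and off-chip traffic can be illustrated with a toy counting model. This is only a sketch under idealized assumptions, not the paper's reordering scheme, which this summary does not describe in detail. The function names are hypothetical, the model covers a single-channel 2D convolution with stride no larger than the kernel size, and the "with reuse" case assumes an on-chip buffer large enough that every input element is fetched from DRAM exactly once.

```python
def output_size(height, width, kernel, stride=1):
    """Output height and width of a valid (unpadded) 2D convolution."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return out_h, out_w


def dram_reads_no_reuse(height, width, kernel, stride=1):
    """Worst case: every output position re-fetches its whole window from DRAM."""
    out_h, out_w = output_size(height, width, kernel, stride)
    return out_h * out_w * kernel * kernel


def dram_reads_with_reuse(height, width, kernel, stride=1):
    """Idealized case: each input element that is used is fetched once
    and then served from on-chip memory (assumes stride <= kernel)."""
    out_h, out_w = output_size(height, width, kernel, stride)
    used_h = (out_h - 1) * stride + kernel
    used_w = (out_w - 1) * stride + kernel
    return used_h * used_w


def reduction_percent(height, width, kernel, stride=1):
    """Relative drop in DRAM reads when going from no reuse to ideal reuse."""
    base = dram_reads_no_reuse(height, width, kernel, stride)
    reuse = dram_reads_with_reuse(height, width, kernel, stride)
    return 100.0 * (base - reuse) / base


if __name__ == "__main__":
    # Hypothetical example: an 8x8 input feature map with a 3x3 kernel.
    print(dram_reads_no_reuse(8, 8, 3))
    print(dram_reads_with_reuse(8, 8, 3))
    print(round(reduction_percent(8, 8, 3), 1))
```

The numbers this model produces depend entirely on the chosen sizes and are not comparable to the 82.9% figure reported in the paper, which comes from the authors' own experiments.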

A Convolutional Neural Network Accelerator Architecture with Fine-Granular Mixed Precision Configurability

DaDianNao: A Machine-Learning Supercomputer

An Enhanced Data Cache with In-Cache Processing Units for Convolutional Neural Network Accelerators

A High-Efficient and Configurable Hardware Accelerator for Convolutional Neural Network

A High Efficient Architecture for Convolution Neural Network Accelerator

A Low-Latency DNN Accelerator Enabled by DFT-Based Convolution Execution Within Crossbar Arrays

A Reconfigurable Spatial Architecture for Energy-Efficient Inception Neural Networks

A Parallel Loading Based Accelerator for Convolution Neural Network

Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA

Efficient Hardware Optimization Strategies For Deep Neural Networks Acceleration Chip

Energy-Efficient Accelerator Design for Deformable Convolution Networks

RRAM Based Buffer Design for Energy Efficient CNN Accelerator

A High Performance Reconfigurable Hardware Architecture for Lightweight Convolutional Neural Network

A Convolution Neural Network Accelerator Design with Weight Mapping and Pipeline Optimization

Memory-centric accelerator design for Convolutional Neural Networks

An Efficient Accelerator for Sparse Convolutional Neural Networks

Design of a Generic Dynamically Reconfigurable Convolutional Neural Network Accelerator with Optimal Balance

A 3D Tiled Low Power Accelerator for Convolutional Neural Network

Myocarditis: A clinical entity that can benefit from noninvasive imaging

A Convolutional Neural Network Accelerator Based on FPGA