IDLA: an Instruction-based Adaptive CNN Accelerator

Peng Gao, Zhize Huang, Hanchen Ye, Gengsheng Chen
DOI: https://doi.org/10.1109/icsict49897.2020.9278360
2020-01-01
Abstract: In this paper, we propose IDLA, an instruction-based adaptive CNN accelerator for fast and efficient deployment of CNN models on FPGAs. The IDLA hardware engine accelerates CNN computation by adaptively using different functional modules. Following a modular design approach, the engine is carefully designed so that all of these modules can work concurrently, which improves the utilization of on-chip resources. In addition, layer fusion and weight reuse strategies are applied to reduce data accesses to DDR. To coordinate with this hardware engine, a network parser is developed that automatically analyzes each CNN model and generates an optimal scheduling scheme for it. Moreover, a customized instruction set of moderate granularity is designed to further enhance flexibility in the joint optimization of software and hardware. We build IDLA on a Xilinx VU9P FPGA. The experimental results show that IDLA reaches 168.76 GOPS on ResNet18 and 277.63 GOPS on VGG16-SVD, with a DSP efficiency of 1.62 Ops/DSP/cycle on VGG16-SVD, considerably better than existing advanced works.
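The DSP-efficiency figure can be related to the throughput figure with a short calculation. The sketch below assumes the usual definition of the metric (operations per second divided by DSP slices times clock frequency); the abstract does not state the DSP count or the clock frequency, so the function names and the `num_dsps` and `clock_mhz` parameters are illustrative assumptions, not values from the paper.

```python
def dsp_efficiency(throughput_gops: float, num_dsps: float, clock_mhz: float) -> float:
    """Return Ops/DSP/cycle, assuming the common definition of the metric.

    num_dsps and clock_mhz are hypothetical inputs; the abstract does not report them.
    """
    ops_per_second = throughput_gops * 1e9
    dsp_cycles_per_second = num_dsps * clock_mhz * 1e6
    return ops_per_second / dsp_cycles_per_second


def implied_dsp_cycles_per_second(throughput_gops: float, efficiency: float) -> float:
    """Return the product (DSP count x clock in Hz) implied by the two reported figures."""
    return throughput_gops * 1e9 / efficiency


# Figures reported in the abstract for VGG16-SVD.
VGG16_SVD_GOPS = 277.63
VGG16_SVD_EFFICIENCY = 1.62

# Only the product of DSP count and clock frequency can be recovered,
# not either factor on its own.
budget = implied_dsp_cycles_per_second(VGG16_SVD_GOPS, VGG16_SVD_EFFICIENCY)
print(f"Implied DSP x clock product: {budget / 1e9:.2f} G DSP-cycles/s")
```

Under this definition, any pair of DSP count and clock frequency whose product matches the implied budget reproduces the reported efficiency, which is why the two numbers alone do not determine the configuration.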