System Integration and Optimization of AI Hardware Acceleration Architecture for Object Detection

Yi-Yen Lai, Chung-Bin Wu, Yen-Ren Hou
DOI: https://doi.org/10.1109/ICCE-Taiwan58799.2023.10226770
2023-07-17
Abstract: This paper proposes a system integration and an optimized hardware acceleration design for the lightweight YOLOv3 object detection network, covering the Convolution Layer, the Maxpooling Layer, the Detection Layer, the Shortcut Layer, and the optimized i output layers. The design is verified and implemented in hardware on the Xilinx Zynq UltraScale+ MPSoC ZCU102 FPGA platform at an operating frequency of 180 MHz. Fusing the Convolution and Maxpooling Layers and fusing the Shortcut and Convolution Layers reduces bandwidth usage by 85.33% and 45.27%, respectively. With the optimized Maxpooling Layer and Shortcut Layer, the running time is 15 and 26 times faster, respectively, than on the ARM Cortex-A53. Furthermore, the realization and results of the system integration are displayed on an HDMI monitor.
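To illustrate why layer fusion lowers bandwidth usage, the following is a minimal Python sketch that counts off-chip feature-map traffic for a convolution followed by a 2x2 max pooling, with and without fusion. It is not the paper's accounting method: the layer dimensions (a 416x416x3 input with 16 filters), the one-byte element size, and the function names are assumptions chosen for illustration, weight traffic is ignored, and the result is not expected to match the 85.33% reported above.

```python
def feature_map_bytes(height, width, channels, bytes_per_elem=1):
    """Size of one feature map in bytes (bytes_per_elem is an assumption)."""
    return height * width * channels * bytes_per_elem


def conv_pool_traffic(height, width, c_in, c_out, pool=2, bytes_per_elem=1):
    """Off-chip feature-map traffic for a stride-1, same-padded convolution
    followed by a pool x pool max pooling with stride `pool`.

    Returns (unfused_bytes, fused_bytes). Weight traffic is ignored.
    """
    conv_in = feature_map_bytes(height, width, c_in, bytes_per_elem)
    conv_out = feature_map_bytes(height, width, c_out, bytes_per_elem)
    pool_out = feature_map_bytes(height // pool, width // pool, c_out, bytes_per_elem)

    # Unfused: the convolution output is written to external memory and
    # then read back by the pooling layer.
    unfused = conv_in + conv_out + conv_out + pool_out

    # Fused: pooling is applied on chip, so only the pooled result is written.
    fused = conv_in + pool_out
    return unfused, fused


# Hypothetical layer shape, used only as an example.
unfused, fused = conv_pool_traffic(416, 416, c_in=3, c_out=16)
saving = 1 - fused / unfused
print(f"unfused: {unfused} B, fused: {fused} B, saving: {saving:.2%}")
```

The saving grows with the ratio of the intermediate convolution output to the input and pooled output, which is why fusion tends to pay off most in early layers with large spatial dimensions.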
Computer Science, Engineering