Efficient Hardware Post Processing of Anchor-Based Object Detection on FPGA

Hui Zhang, Wei Wu, Yufei Ma, Zhongfeng Wang
DOI: https://doi.org/10.1109/isvlsi49217.2020.00089
2020-01-01
Abstract: Object detection has been widely adopted in video analysis and image understanding. Anchor-based object detection performs well under scale variation, a long-standing problem in object detection. Postprocessing is an essential step of anchor-based object detection that follows the convolutional neural network (CNN), and it requires long computation time on a CPU or GPU. In this paper, we propose an efficient FPGA solution for the postprocessing that uses fixed-point representation. The quantization error of the fixed-point representation comes mainly from the sigmoid function and the exponential function. To reduce this error, we implement the sigmoid function on the FPGA with a piecewise non-linear approximation and the exponential function with a "LUT and shifting" method. Both functions are demonstrated to reach an accuracy of 10⁻⁴. In addition, Non-Maximum Suppression (NMS) is implemented to remove redundant bounding boxes. Based on these components, a fast and resource-efficient postprocessing accelerator is implemented on an Intel Arria 10 FPGA. Using only about 1% of the FPGA hardware resources, our design achieves speedups of about 111×, 50×, and 290× over software implementations on a desktop CPU, a GPU, and the embedded CPU inside the FPGA, respectively.
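The abstract names two building blocks without giving their details: an exponential computed by "LUT and shifting", and NMS. The Python sketch below illustrates one common reading of each and is not the authors' hardware design. It assumes that "LUT and shifting" means rewriting e^x as 2^(x·log2 e), taking the fractional power of two from a lookup table and applying the integer power as a bit shift, and it assumes the standard greedy form of NMS. The names `FRAC_BITS`, `LUT_BITS`, `exp_fixed`, `iou`, and `nms`, the bit widths, and the 0.5 IoU threshold are all hypothetical choices made for illustration.

```python
import math

FRAC_BITS = 16  # fractional bits of the fixed-point format (assumed, not from the paper)
LUT_BITS = 8    # address width of the lookup table (assumed, not from the paper)

# log2(e) as a fixed-point constant.
LOG2E_FX = round(math.log2(math.e) * (1 << FRAC_BITS))

# Table of 2**f for f in [0, 1), stored in fixed point.
EXP2_LUT = [round(2 ** (i / (1 << LUT_BITS)) * (1 << FRAC_BITS))
            for i in range(1 << LUT_BITS)]


def exp_fixed(x_fx):
    """Approximate e**x. Input and output are integers scaled by 2**FRAC_BITS."""
    # y = x * log2(e), still in fixed point. Python's >> floors negative values.
    y = (x_fx * LOG2E_FX) >> FRAC_BITS
    n = y >> FRAC_BITS                    # integer part of y (floor)
    f = y & ((1 << FRAC_BITS) - 1)        # fractional part of y, in [0, 1)
    m = EXP2_LUT[f >> (FRAC_BITS - LUT_BITS)]  # 2**f from the table
    # Multiply by 2**n with a shift: left for n >= 0, right for n < 0.
    return m << n if n >= 0 else m >> -n


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes that overlap a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    one = 1 << FRAC_BITS
    print(exp_fixed(one) / one, math.e)
    print(nms([(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)], [0.9, 0.8, 0.7]))
```

With an 8-bit table and no interpolation, this sketch is noticeably less accurate than the 10⁻⁴ reported in the abstract. Reaching that level would need a larger table or an interpolation step, neither of which is shown here.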