Optimized Parallel Implementation of Face Detection Based on Embedded Heterogeneous Many-Core Architecture

Fang Gao, Zhangqin Huang, Shulong Wang, Xinrong Ji
DOI: https://doi.org/10.1142/s0218001417560110
IF: 1.261
2017-01-01
International Journal of Pattern Recognition and Artificial Intelligence
Abstract: Computing performance is a key constraint for high-resolution face detection on embedded systems. To improve it, a novel parallel implementation of an embedded face detection system was built on a low-power CPU-accelerator heterogeneous many-core architecture. First, a basic CPU version of the face detection prototype was implemented using a cascade classifier and the Local Binary Patterns (LBP) operator. Second, the prototype was ported to Parallella, an embedded parallel computing platform that combines a Xilinx Zynq with an Adapteva Epiphany. Third, the face detection algorithm was optimized for the Parallella architecture to improve detection speed and the utilization of computing resources. Finally, a face detection experiment was conducted to evaluate the computing performance of the proposed implementation. The experimental results show that the proposed implementation achieved accuracy consistent with that of the dual-core ARM and ran 7.8 times faster than it. These results show that the proposed implementation has a significant advantage in computing performance.
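For readers unfamiliar with the Local Binary Patterns operator named in the abstract, the following is a minimal sketch of the plain 3x3 LBP code for a single pixel. It is not the paper's implementation: the function name `lbp_code`, the clockwise neighbor ordering, and the `>=` comparison are illustrative assumptions, and the detector described in the paper may use a different LBP variant.

```python
def lbp_code(img, r, c):
    """Return the 8-bit LBP code of the pixel at (r, c).

    img is a 2D list of grayscale intensities; (r, c) must not lie on the border.
    """
    center = img[r][c]
    # Neighbors visited clockwise starting at the top-left pixel.
    # This ordering is an assumption; any fixed ordering gives a valid code.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbor is at least as bright as the center.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


# Hypothetical 3x3 patch used only for illustration.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch, 1, 1))
```

Because each pixel's code depends only on its own 3x3 neighborhood, the computation is independent across pixels, which is the kind of data parallelism a many-core accelerator can exploit.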