FPGA-Based Parallel Implementation of SURF Algorithm
Wenjie Chen,Shuaishuai Ding,Zhilei Chai,Daojing He,Weihua Zhang,Guanhua Zhang,Qiwei Peng,Wang Luo
DOI: https://doi.org/10.1109/ICPADS.2016.0049
2016-01-01
Abstract:SURF (Speeded up robust features) detection is used extensively in object detection, tracking and matching. However, due to its high complexity, it is usually a challenge to perform such detection in real time on a general-purpose processor. This paper proposes a parallel computing algorithm for the fast computation of SURF, which is specially designed for FPGAs. By efficiently exploiting the advantages of the architecture of an FPGA, and by appropriately handling the inherent parallelism of the SURF computation, the proposed algorithm is able to significantly reduce the computation time. Our experimental results show that, for an image with a resolution of 640x480, the processing time for computing using SURF is only 0.047 seconds on an FPGA (XC6SLX150T, 66.7 MHz), which is 13 times faster than when performed on a typical i3-3240 CPU (with a 3.4 GHz main frequency) and 249 times faster than when performed on a traditional ARM system (CortexTM-A8, 1 GHz).