A Fast and Power-Efficient Hardware Architecture for Non-Maximum Suppression

Man Shi, Peng Ouyang, Shouyi Yin, Leibo Liu, Shaojun Wei
DOI: https://doi.org/10.1109/tcsii.2019.2893527
2019-01-01
Abstract: Non-maximum suppression (NMS) is an indispensable post-processing step in face detection. The vast majority of face detection methods need NMS to merge the candidate face boxes that belong to the same face. However, standard NMS is a greedy, local optimization technique that suffers from several shortcomings, such as high complexity, high latency, and large power consumption. This brief alleviates these problems by presenting an efficient hardware architecture for NMS, and it also optimizes the calculation unit to reduce area. Building on multi-thread computing, this brief uses a sliding window to obtain parallelism and a position-based bit table technique to improve data access and data reuse, which greatly decreases memory access cost and power consumption. The proposed hardware architecture is implemented in TSMC 28-nm technology. Experiments show that the power consumption is 6.142 mW and the latency is 12.79 to cluster 1000 candidate boxes, with energy efficiency higher than that of state-of-the-art methods by 3798 and 358, respectively.
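For readers unfamiliar with the baseline, the standard greedy NMS that the abstract refers to can be sketched in software roughly as follows. This is only an illustrative reference version, not the hardware architecture proposed in the brief: the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 overlap threshold are all assumptions made for this example. The nested loop shows where the pairwise comparison cost comes from.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first.

    iou_threshold is an assumed default for illustration only.
    """
    # Visit candidates in descending score order.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = [False] * len(boxes)
    keep = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        # Suppress every lower-scoring box that overlaps the kept box too much.
        for j in order[pos + 1:]:
            if not suppressed[j] and iou(boxes[i], boxes[j]) > iou_threshold:
                suppressed[j] = True
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)  # the second box overlaps the first and is dropped
```

Because every kept box is compared against all remaining lower-scoring candidates, the number of overlap computations grows quadratically with the number of candidates in the worst case, which is the kind of cost a parallel hardware design aims to reduce.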