A 1096fps Hardware Architecture For Fast Training In Object Tracking

Yun Lv, Huiyu Mo, Leibo Liu, Shouyi Yin, Shaojun Wei, Wenping Zhu, Qiang Li
DOI: https://doi.org/10.1109/iccsn.2019.8905251
2019-01-01
Abstract: In recent years, Discriminative Correlation Filter based methods have significantly advanced the state of the art in tracking accuracy. However, their high-complexity training process makes it hard for a tracker to maintain both high accuracy and high speed. In this work, the training algorithm is optimized to significantly reduce its computation with acceptable accuracy loss. Dedicated hardware is then designed to further accelerate the training process while maintaining high accuracy. First, timing constraints are relaxed so that a serial module can be turned into a parallel one. Second, the symmetry and sparsity of the regularization filter kernel are exploited to cut the computation of the regularization convolution by 80%. Third, the computation of the inner product module in training is reduced by splitting complex-number calculations into separate calculations on the real and imaginary parts. Overall, about 24.19% of the training computation is eliminated and 4.30% of the parallel processing time is saved, yielding a 1.32x improvement in hardware resources and a 1.05x speedup over the original process. Simulation results show that the hardware achieves a throughput of 1096 fps at 250 MHz, which makes it especially suitable for tracking tasks that demand both high speed and high accuracy.
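The third optimization, computing a complex inner product through separate real and imaginary calculations, can be illustrated with a minimal sketch. The abstract does not give the exact form of the inner product, so the conjugate inner product sum(conj(a_k) * b_k) is assumed here, and the function names `inner_product_split` and `inner_product_reference` are hypothetical, chosen only for illustration. The sketch shows the arithmetic identity only, not the paper's hardware datapath.

```python
def inner_product_split(a_re, a_im, b_re, b_im):
    """Return (real, imag) of sum(conj(a_k) * b_k) using real arithmetic only.

    Assumption: the conjugate inner product is used. Each term expands as
    conj(a) * b = (ar*br + ai*bi) + j*(ar*bi - ai*br),
    so the real and imaginary parts can be accumulated independently.
    """
    real_acc = 0.0
    imag_acc = 0.0
    for ar, ai, br, bi in zip(a_re, a_im, b_re, b_im):
        real_acc += ar * br + ai * bi   # real-part accumulator
        imag_acc += ar * bi - ai * br   # imaginary-part accumulator
    return real_acc, imag_acc


def inner_product_reference(a, b):
    """Reference result using Python's built-in complex type."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


if __name__ == "__main__":
    a = [1 + 2j, 3 - 1j]
    b = [0.5 - 1j, 2 + 2j]
    re, im = inner_product_split(
        [z.real for z in a], [z.imag for z in a],
        [z.real for z in b], [z.imag for z in b],
    )
    print(re, im)
    print(inner_product_reference(a, b))
```

Because the two accumulators share no data dependency, they can in principle run as two parallel real-valued multiply-accumulate chains. Whether the paper's design is organized this way is not stated in the abstract.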