Multi-layer CNN Features Aggregation for Real-time Visual Tracking

Lijia Zhang, Yanmei Dong, Yuwei Wu
DOI: https://doi.org/10.1109/icpr.2018.8546079
2018-01-01
Abstract: In this paper, we propose a novel convolutional neural network (CNN) based tracking framework that aggregates CNN features from different layers into a robust representation and achieves real-time tracking. We find that some feature maps interfere with effective object representation. Instead of using the original features, we build an end-to-end feature aggregation network (FAN) that suppresses the noisy feature maps of CNN layers. The aggregated feature represents objects with both coarse semantic information and fine details. The FAN, as a lightweight network, can run in real time. The highlighted region of the feature maps produced by the FAN is taken as the tracking result. Our method runs at a real-time speed of 24 fps while maintaining promising accuracy compared with state-of-the-art methods on existing tracking benchmarks.
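The aggregation idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' FAN: the abstract gives no architectural details, so the per-channel weights here are hand-set, hypothetical stand-ins for what a trained network would learn, and the function names (`upsample_nearest`, `aggregate`, `peak_location`) are invented for illustration. The sketch assumes that maps from a coarse (deep) layer and a fine (shallow) layer are resized to a common resolution, that noisy channels are suppressed by a zero weight, and that the strongest response marks the target.

```python
def upsample_nearest(fmap, out_h, out_w):
    """Resize a 2-D map (list of rows) to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def aggregate(layers, weights, out_h, out_w):
    """Weighted sum of channel maps taken from several layers.

    layers  : list of layers, each a list of 2-D channel maps
    weights : matching per-channel weights (hypothetical, hand-set here);
              a weight of 0.0 suppresses a noisy channel entirely
    """
    response = [[0.0] * out_w for _ in range(out_h)]
    for channels, layer_weights in zip(layers, weights):
        for fmap, w in zip(channels, layer_weights):
            if w == 0.0:
                continue  # suppressed channel contributes nothing
            resized = upsample_nearest(fmap, out_h, out_w)
            for r in range(out_h):
                for c in range(out_w):
                    response[r][c] += w * resized[r][c]
    return response


def peak_location(response):
    """Return (row, col) of the strongest response in the aggregated map."""
    best = max(
        (value, r, c)
        for r, row in enumerate(response)
        for c, value in enumerate(row)
    )
    return best[1], best[2]


# Toy data: a coarse 2x2 "semantic" map and two fine 4x4 maps, one of them noisy.
deep = [[[0.0, 0.0], [0.0, 1.0]]]
clean = [[0.0] * 4 for _ in range(4)]
clean[2][3] = 1.0
noisy = [[0.0] * 4 for _ in range(4)]
noisy[0][0] = 5.0

suppressed = aggregate([deep, [clean, noisy]], [[1.0], [1.0, 0.0]], 4, 4)
unsuppressed = aggregate([deep, [clean, noisy]], [[1.0], [1.0, 1.0]], 4, 4)
print(peak_location(suppressed), peak_location(unsuppressed))
```

In this toy setup the coarse map narrows the target to the lower-right quadrant and the clean fine map pins it to one cell. With the noisy channel left in, the peak jumps to the distractor instead, which is the kind of interference the abstract says the FAN is meant to suppress.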