Hierarchical Feature Embedding for Visual Tracking

Zhixiong Pi, Weitao Wan, Chong Sun, Changxin Gao, Nong Sang, Chen Li
DOI: https://doi.org/10.1007/978-3-031-20047-2_25
2022-01-01
Abstract: Features extracted by existing tracking methods may contain both instance- and category-level information. However, because the two types of information are not explicitly modeled, one of them often dominates the feature embeddings uncontrollably, depending on the training data distribution. A more favorable approach is to produce features that emphasize both types of information. To achieve this, we propose a hierarchical feature embedding model that learns the instance and category information separately and embeds them progressively. We develop instance-aware and category-aware modules that collaborate at different semantic levels to produce discriminative and robust feature embeddings. The instance-aware module concentrates on the instance level, where an inter-video contrastive learning mechanism is adopted to facilitate inter-instance separability and intra-instance compactness. However, it is challenging to enforce intra-instance compactness using instance-level information alone because of the prevalent appearance changes of the instance in visual tracking. To tackle this problem, the category-aware module is employed to summarize high-level category information, which remains robust despite instance-level appearance changes. As such, intra-instance compactness can be effectively improved by jointly leveraging the instance- and category-aware modules. Experimental results on various benchmarks demonstrate that the proposed method performs favorably against state-of-the-art methods. The code is available at https://github.com/zxgravity/CIA.
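The abstract does not spell out the contrastive objective, so the following is only an illustrative sketch of how an inter-video contrastive loss is commonly written, using a generic InfoNCE-style formulation. It is not taken from the paper or its repository: the function name `info_nce_loss`, the `temperature` value, and the batch layout (row i of each array comes from the same target instance, other rows come from other videos) are all assumptions made for illustration.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, not the paper's code).

    anchors[i] and positives[i] are assumed to be embeddings of the same
    target instance, e.g. taken from two frames of one video. Every other
    row in the batch, assumed to come from other videos, acts as a negative.
    """
    # L2-normalise so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (N, N) similarity matrix; matching pairs lie on the diagonal.
    logits = a @ p.T / temperature
    # Subtract the row-wise maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)

    # Row-wise log-softmax.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Pull matching pairs together, push the other instances apart.
    return -np.mean(np.diag(log_prob))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))
    # Slightly perturbed copies stand in for the same instances in other frames.
    same_instance = feats + 0.01 * rng.normal(size=feats.shape)
    print(info_nce_loss(feats, same_instance))
```

Minimizing a loss of this shape rewards embeddings in which views of the same instance are close (intra-instance compactness) and views of different instances are far apart (inter-instance separability), which is the behavior the abstract attributes to the instance-aware module.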