RGB-T Object Tracking: Benchmark and Baseline

Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, Jin Tang
DOI: https://doi.org/10.48550/arXiv.1805.08982
2018-05-23
Abstract: RGB-Thermal (RGB-T) object tracking receives more and more attention due to the strongly complementary benefits of thermal information to visible data. However, RGB-T research is limited by lacking a comprehensive evaluation platform. In this paper, we propose a large-scale video benchmark dataset for RGB-T tracking. It has three major advantages over existing ones: 1) Its size is sufficiently large for large-scale performance evaluation (total frame number: 234K, maximum frame per sequence: 8K). 2) The alignment between RGB-T sequence pairs is highly accurate, which does not need pre- or post-processing. 3) The occlusion levels are annotated for occlusion-sensitive performance analysis of different tracking algorithms. Moreover, we propose a novel graph-based approach to learn a robust object representation for RGB-T tracking. In particular, the tracked object is represented with a graph with image patches as nodes. This graph including graph structure, node weights and edge weights is dynamically learned in a unified ADMM (alternating direction method of multipliers)-based optimization framework, in which the modality weights are also incorporated for adaptive fusion of multiple source data. Extensive experiments on the large-scale dataset are executed to demonstrate the effectiveness of the proposed tracker against other state-of-the-art tracking methods. We also provide new insights and potential research directions to the field of RGB-T object tracking.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem that this paper attempts to solve is how to use RGB-Thermal (RGB-T) data for robust object tracking under complex environmental conditions. Specifically, the paper focuses on several key issues in existing RGB-T object tracking research:

1. **Lack of a comprehensive evaluation platform**: Existing RGB-T object tracking research is limited by the lack of an integrated evaluation platform, which makes it difficult to conduct a comprehensive and systematic comparison of different algorithms.
2. **Limited dataset scale**: Existing RGB-T datasets are small in scale and cannot provide enough samples for large-scale performance evaluation, limiting the testing and optimization of algorithms.
3. **Inaccurate modality alignment**: The alignment between RGB and thermal imaging videos in existing datasets is not accurate enough, requiring additional pre-processing or post-processing steps, which increases the difficulty of use.
4. **Lack of occlusion annotation**: Existing datasets lack detailed annotations of occlusion levels, which affects the performance analysis of different algorithms in occlusion situations.

To solve these problems, the paper makes the following contributions:

- **Constructed a large-scale RGB-T benchmark dataset**: This dataset contains 234 video sequences, with a total number of frames of approximately 234K, and the maximum number of frames in a single sequence is 8K. The dataset has high-precision modality alignment, does not require pre-processing or post-processing, and is annotated with occlusion levels, supporting detailed performance analysis of different algorithms.
- **Proposed a graph-based learning method**: This method learns a robust feature representation of the target object by constructing a dynamic graph model. Nodes in the graph model represent image patches, edges represent relationships between nodes, and node weights and edge weights reflect the importance of each image patch and its relationship with other patches. In addition, modality weights are introduced to adaptively fuse multi-source data.
- **Conducted extensive experimental verification**: A large number of experiments were carried out on a large-scale benchmark dataset to verify the effectiveness of the proposed method and provide new insights and future research directions.

Through these contributions, the paper aims to promote the research progress in the field of RGB-T object tracking, especially the improvement of robustness under complex environmental conditions.
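
To make the graph description above more concrete, the following is a minimal sketch of the data structure only: image patches as nodes, pairwise edge weights per modality, a modality-weighted fusion, and per-node weights. It is not the paper's method. The paper learns the graph structure, node weights, edge weights, and modality weights jointly in an ADMM-based optimization, whereas this sketch assumes hand-picked stand-ins: mean patch intensity as the feature, a Gaussian kernel for edge weights, fixed modality weights, and normalized node degree as the node weight. All function names and parameters (`patch_features`, `edge_weights`, `fuse_graphs`, `node_weights`, `sigma`) are hypothetical.

```python
import math


def patch_features(image, rows, cols):
    """Mean intensity of each cell in a rows x cols grid over a 2D image.

    Assumption: mean intensity stands in for a real patch descriptor.
    """
    height, width = len(image), len(image[0])
    ph, pw = height // rows, width // cols
    feats = []
    for r in range(rows):
        for c in range(cols):
            cell = [image[y][x]
                    for y in range(r * ph, (r + 1) * ph)
                    for x in range(c * pw, (c + 1) * pw)]
            feats.append(sum(cell) / len(cell))
    return feats


def edge_weights(feats, sigma=0.2):
    """Gaussian-kernel affinity between patch features (no self-loops).

    Assumption: a fixed kernel replaces the learned edge weights.
    """
    n = len(feats)
    return [[0.0 if i == j
             else math.exp(-((feats[i] - feats[j]) ** 2) / (2 * sigma ** 2))
             for j in range(n)]
            for i in range(n)]


def fuse_graphs(graphs, modality_weights):
    """Weighted sum of per-modality affinity matrices.

    Assumption: the modality weights are given, not learned.
    """
    total = sum(modality_weights)
    norm = [w / total for w in modality_weights]
    n = len(graphs[0])
    return [[sum(norm[m] * graphs[m][i][j] for m in range(len(graphs)))
             for j in range(n)]
            for i in range(n)]


def node_weights(fused):
    """Normalized node degree as a crude proxy for patch importance."""
    degrees = [sum(row) for row in fused]
    total = sum(degrees)
    if total == 0:
        return [1.0 / len(fused)] * len(fused)
    return [d / total for d in degrees]


# Toy 4x4 "frames" split into a 2x2 grid of patches (values are made up).
rgb = [[0.9, 0.9, 0.1, 0.1],
       [0.9, 0.9, 0.1, 0.1],
       [0.9, 0.9, 0.9, 0.9],
       [0.9, 0.9, 0.9, 0.9]]
thermal = [[0.8, 0.8, 0.2, 0.2],
           [0.8, 0.8, 0.2, 0.2],
           [0.8, 0.8, 0.2, 0.2],
           [0.8, 0.8, 0.2, 0.2]]

graph_rgb = edge_weights(patch_features(rgb, 2, 2))
graph_thermal = edge_weights(patch_features(thermal, 2, 2))
fused = fuse_graphs([graph_rgb, graph_thermal], [0.3, 0.7])
weights = node_weights(fused)
```

In this toy input the top-right patch differs from the others in the RGB frame, so it ends up weakly connected and receives a lower node weight than the top-left patch. A learned graph, as described in the paper, would adapt these quantities per frame instead of relying on a fixed kernel and fixed modality weights.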