Abstract: This report presents our method for Single Object Tracking (SOT), which aims to track a specified object throughout a video sequence. We employ the LoRAT method. The essence of the work lies in adapting LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking. We train our model using the extensive LaSOT and GOT-10k datasets, which provide a solid foundation for robust performance. Additionally, we implement the Alpha-Refine technique for post-processing the bounding box outputs. Although the Alpha-Refine method does not yield the anticipated results, our overall approach achieves a score of 0.813, securing first place in the competition.

What problem does this paper attempt to address?

This paper aims to improve performance and training efficiency in the Single Object Tracking (SOT) task. Specifically, the research team seeks to track a specified object more accurately throughout a video sequence by improving how an existing model is adapted and trained. To this end, they adopted the following strategies:

1. **Application of LoRA**: LoRA (Low-Rank Adaptation) is a fine-tuning technique that updates only a small subset of the model's parameters without adding inference latency. The team applies LoRA to visual tracking, through the LoRAT method, to improve the adaptability and efficiency of the model.
2. **Use of large-scale datasets**: To enhance the robustness and generalization ability of the model, the team trained on two large-scale datasets, LaSOT and GOT-10k. These datasets cover diverse scenarios, which helps the model cope with complex situations in practice.
3. **Alpha-Refine post-processing**: Alpha-Refine aims to improve tracking performance through more accurate bounding box estimation. The team applied it as a post-processing step on the bounding box outputs, but it did not yield the anticipated results.

Overall, the proposed method achieved an average IoU (Intersection over Union) score of 0.813 in the Perception Test Challenge 2024, winning first place.
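The "without adding inference latency" property in point 1 can be illustrated with a minimal sketch. It is not the LoRAT implementation: the matrix sizes, the `alpha` scaling value, and the helper names are assumptions chosen for illustration. It shows only the general LoRA idea, in which a frozen weight `W` is augmented by a trainable low-rank product `B A` that can be folded back into `W` after training, so the deployed layer costs the same as the original.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_out, d_in, rank, alpha = 16, 16, 2, 4.0  # illustrative sizes, not from the paper

W = rand_matrix(d_out, d_in, rng)  # frozen pretrained weight
# In practice B usually starts at zero; random values are used here so the
# equivalence check below is not trivial.
B = rand_matrix(d_out, rank, rng)  # trainable low-rank factor
A = rand_matrix(rank, d_in, rng)   # trainable low-rank factor
x = rand_matrix(d_in, 1, rng)      # one input column vector

scaling = alpha / rank

# Training-time forward pass: frozen path plus the low-rank adapter path.
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), scaling))

# Deployment: fold the adapter into W once, then run a single matmul.
W_merged = add(W, scale(matmul(B, A), scaling))
y_merged = matmul(W_merged, x)

max_diff = max(abs(p[0] - q[0]) for p, q in zip(y_adapter, y_merged))
trainable = rank * (d_in + d_out)  # parameters in A and B
total = d_in * d_out               # parameters in W
print(f"max difference: {max_diff:.2e}, trainable {trainable} of {total}")
```

Because the merged layer has the same shape as `W`, the adapter adds no extra computation at inference time, while only the entries of `A` and `B` are updated during fine-tuning.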

### Formula representation

The formulas involved in the paper mainly concern evaluation metrics, such as the average Intersection over Union (average IoU), which is computed as follows:

\[ \text{average IoU}=\frac{1}{N}\sum_{i = 1}^{N}\frac{\text{Area of Overlap}_i}{\text{Area of Union}_i} \]

where:

- \(N\) is the number of tracked objects;
- \(\text{Area of Overlap}_i\) is the area of the overlapping region between the predicted bounding box and the ground-truth bounding box for the \(i\)-th object;
- \(\text{Area of Union}_i\) is the area of the union region between the predicted bounding box and the ground-truth bounding box for the \(i\)-th object.

This formula quantifies the performance of the model in the tracking task.
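
The average IoU formula above can be sketched in a few lines of Python. This is an illustrative implementation rather than the challenge's official scoring code: the `(x1, y1, x2, y2)` corner format for boxes and the function names `box_iou` and `average_iou` are assumptions.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region, clamped to zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    overlap = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - overlap
    return overlap / union if union > 0 else 0.0

def average_iou(predictions, ground_truths):
    """Mean IoU over N paired predicted and ground-truth boxes."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths must have equal length")
    if not predictions:
        raise ValueError("at least one box pair is required")
    pairs = zip(predictions, ground_truths)
    return sum(box_iou(p, g) for p, g in pairs) / len(predictions)

# One perfect prediction and one complete miss.
preds = [(0, 0, 2, 2), (0, 0, 1, 1)]
truth = [(0, 0, 2, 2), (5, 5, 6, 6)]
print(average_iou(preds, truth))  # prints 0.5
```

A score of 1.0 would mean every predicted box matches its ground truth exactly, so the reported 0.813 indicates a high degree of overlap on average.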

The Solution for Single Object Tracking Task of Perception Test Challenge 2024

RTrack: Accelerating Convergence for Visual Object Tracking via Pseudo-Boxes Exploration

The Visual Object Tracking VOT2014 Challenge Results

Beyond Local Search: Tracking Objects Everywhere with Instance-Specific Proposals

The Visual Object Tracking VOT2013 Challenge Results

Toward Accurate Pixelwise Object Tracking via Attention Retrieval

LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

SOT for MOT

LaSOT: A High-quality Large-scale Single Object Tracking Benchmark

Single Object Tracking in Satellite Videos Based on Feature Enhancement and Multi-Level Matching Strategy

SRRT: Exploring Search Region Regulation for Visual Object Tracking

Solution for Point Tracking Task of ECCV 2nd Perception Test Challenge 2024

Rethinking the Competition Between Detection and ReID in Multiobject Tracking

NeighborTrack: Improving Single Object Tracking by Bipartite Matching with Neighbor Tracklets

UHP-SOT: An Unsupervised High-Performance Single Object Tracker

OmniTracker: Unifying Object Tracking by Tracking-with-Detection

Object Preserving Siamese Network for Single Object Tracking on Point Clouds

1st Place Solutions of Waymo Open Dataset Challenge 2020 -- 2D Object Detection Track

SiamRDT: An Object Tracking Algorithm Based on a Reliable Dynamic Template

Visual Object Tracking With Mutual Affinity Aligned to Human Intuition