Abstract: Recent advances in visual tracking have been driven by the combination of Siamese networks and region proposal networks, which achieve competitive accuracy while remaining computationally efficient. However, these approaches often suffer from parametric redundancy and additional computational cost owing to their use of anchor boxes or multiscale pyramids, leaving room for improvement. In this study, a novel extreme point graph-guided tracking approach called SiamEXTR is presented, which tracks generic target objects by detecting five keypoints, namely four target-specific extreme points (i.e., the topmost, bottommost, leftmost, and rightmost points) and a center point, without requiring region classification or bounding box regression. To enhance the robustness of extreme point detection, a new oriented pooling strategy called extreme-pooling is proposed, which captures more discriminative global and local information to help each pixel predict its category. In addition, a U-shaped backbone network is designed to preserve fine-grained visual details and strong semantic information at high resolution, so that the detection granularity of the extreme point graph is as close to the subpixel level as possible. Based on the detected extreme point graph, the proposed approach not only generates axis-aligned bounding boxes for object annotation but also provides more accurate octagonal object segmentation masks through a simple approximation strategy. Without bells and whistles, extensive experiments and comparisons on several large-scale benchmark datasets demonstrate that the SiamEXTR tracker consistently achieves competitive performance, with running speeds significantly exceeding 140 frames per second. The authors hope that the concept behind this approach will serve as a new baseline and promote further development in the visual tracking community.
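
The step from four extreme points to a box and an octagonal mask can be illustrated with a minimal sketch. The abstract does not specify the approximation strategy, so everything below is an assumption made for illustration: the function names, the `frac` parameter, and the rule of extending each extreme point along its box edge into a short segment (a common way to build octagons from extreme points, not necessarily the one used by SiamEXTR). Points are `(x, y)` in image coordinates, with `y` growing downward.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bbox_from_extreme_points(top: Point, bottom: Point,
                             left: Point, right: Point) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) spanned by the extreme points."""
    return (left[0], top[1], right[0], bottom[1])


def octagon_from_extreme_points(top: Point, bottom: Point,
                                left: Point, right: Point,
                                frac: float = 0.25) -> List[Point]:
    """Octagon vertices, starting on the top edge and following the box edges.

    Assumption (not from the abstract): each extreme point is stretched along
    its box edge into a segment whose total length is `frac` of that edge,
    truncated at the box corners. Joining the eight segment endpoints gives
    the octagon.
    """
    x_min, y_min, x_max, y_max = bbox_from_extreme_points(top, bottom, left, right)
    half_w = frac * (x_max - x_min) / 2.0  # half segment length on horizontal edges
    half_h = frac * (y_max - y_min) / 2.0  # half segment length on vertical edges

    def clamp_x(x: float) -> float:
        return min(max(x, x_min), x_max)

    def clamp_y(y: float) -> float:
        return min(max(y, y_min), y_max)

    return [
        (clamp_x(top[0] - half_w), y_min), (clamp_x(top[0] + half_w), y_min),
        (x_max, clamp_y(right[1] - half_h)), (x_max, clamp_y(right[1] + half_h)),
        (clamp_x(bottom[0] + half_w), y_max), (clamp_x(bottom[0] - half_w), y_max),
        (x_min, clamp_y(left[1] + half_h)), (x_min, clamp_y(left[1] - half_h)),
    ]


# Hypothetical extreme points for a roughly round target.
box = bbox_from_extreme_points((5, 0), (5, 10), (0, 5), (10, 5))
octagon = octagon_from_extreme_points((5, 0), (5, 10), (0, 5), (10, 5))
```

The resulting vertex list can be rasterized with any polygon-fill routine to obtain a mask; a smaller `frac` yields a shape closer to a diamond, while a larger one approaches the bounding box.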

Siamese Refine Polar Mask Prediction Network for Visual Tracking

PointSiamRCNN: Target-aware Voxel-based Siamese Tracker for Point Clouds

Siamese Centerness Prediction Network for Real-Time Visual Object Tracking

Residual Attention SiameseRPN for Visual Tracking

SiamCPN: Visual tracking with the Siamese center-prediction network

SiamCorners: Siamese Corner Networks for Visual Tracking

High Performance Visual Tracking with Siamese Region Proposal Network

Siamese Residual Network for Efficient Visual Tracking

Robust visual tracking with extreme point graph-guided annotation: Approach and experiment

SiamMask: A Framework for Fast Online Object Tracking and Segmentation

Discriminative and Robust Online Learning for Siamese Visual Tracking

IoU-guided Siamese region proposal network for real-time visual tracking

R-SiamNet: ROI-Align Pooling Based Siamese Network for Object Tracking

Learning Motion-Perceive Siamese network for robust visual object tracking

Adaptive Siamese Tracking with a Compact Latent Network

SiamRCR: Reciprocal Classification and Regression for Visual Object Tracking

Object Tracking Algorithm Based on Channel-interconnection-spatial Attention Mechanism and Siamese Region Proposal Network

Siamese Tracking Network with Spatial-Semantic-Aware Attention and Flexible Spatiotemporal Constraint

SiamON: Siamese Occlusion-aware Network for Visual Tracking

SiamBAN: Target-Aware Tracking With Siamese Box Adaptive Network

SiamRDT: An Object Tracking Algorithm Based on a Reliable Dynamic Template