Abstract: Drone aerial imaging has become increasingly important across many fields as drone optical sensor technology advances. A critical challenge in this domain is multi-object tracking that is both accurate and efficient. Traditional deep learning methods often separate object identification from tracking, which increases complexity and can degrade performance. Conventional approaches also rely heavily on manual feature engineering and intricate algorithms, which further limits efficiency. To overcome these limitations, we propose a Transformer-based end-to-end multi-object tracking framework. The method uses self-attention to capture complex inter-object relationships and integrates object detection and tracking into a single unified process. End-to-end training simplifies the tracking pipeline and improves performance. A key component is a trajectory detection label matching technique that assigns labels based on a joint assessment of object appearance, spatial characteristics, and Gaussian features, yielding more precise and logical label assignments. We also incorporate cross-frame self-attention to extract long-term object properties, which provides robust information for stable and consistent tracking. Tracking performance is further improved by a self-characteristics module that extracts semantic features from trajectory information in both the current and previous frames. This module keeps the long-term interaction modules semantically consistent, enabling more accurate and continuous tracking over time. The refined data and stored trajectories are then fed into the processing of subsequent frames, forming a feedback loop that sustains tracking accuracy. Extensive experiments on the VisDrone and UAVDT datasets demonstrate the superior performance of our approach in drone-based multi-object tracking.
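
The label matching step described above can be illustrated with a minimal sketch. The abstract does not give the cost definition, so everything here is an assumption rather than the authors' method: the appearance term is taken as cosine distance between hypothetical embeddings (`feat`), the spatial term as 1 - IoU, and the Gaussian term as a normalized Wasserstein similarity between boxes modeled as 2D Gaussians. The equal weights, the `scale` constant, and the function names are likewise illustrative. The assignment is found by exhaustive search, which is only practical for a handful of objects; a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would normally replace it.

```python
import math
from itertools import permutations

# Boxes are (cx, cy, w, h); "feat" is a hypothetical appearance embedding.

def cosine_distance(a, b):
    """Appearance cost: 1 - cosine similarity of two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 if norm == 0 else 1.0 - dot / norm

def iou(b1, b2):
    """Spatial similarity: intersection over union of two boxes."""
    ix = min(b1[0] + b1[2] / 2, b2[0] + b2[2] / 2) - max(b1[0] - b1[2] / 2, b2[0] - b2[2] / 2)
    iy = min(b1[1] + b1[3] / 2, b2[1] + b2[3] / 2) - max(b1[1] - b1[3] / 2, b2[1] - b2[3] / 2)
    inter = max(ix, 0.0) * max(iy, 0.0)
    union = b1[2] * b1[3] + b2[2] * b2[3] - inter
    return inter / union if union > 0 else 0.0

def gaussian_similarity(b1, b2, scale=50.0):
    """Assumed Gaussian term: each box is N(center, diag(w^2/4, h^2/4));
    the squared 2-Wasserstein distance then has the closed form below."""
    w2_sq = ((b1[0] - b2[0]) ** 2 + (b1[1] - b2[1]) ** 2
             + ((b1[2] - b2[2]) / 2) ** 2 + ((b1[3] - b2[3]) / 2) ** 2)
    return math.exp(-math.sqrt(w2_sq) / scale)

def match_cost(track, det, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of appearance, spatial, and Gaussian costs (weights assumed)."""
    wa, ws, wg = weights
    return (wa * cosine_distance(track["feat"], det["feat"])
            + ws * (1.0 - iou(track["box"], det["box"]))
            + wg * (1.0 - gaussian_similarity(track["box"], det["box"])))

def assign(tracks, dets):
    """Minimum-cost one-to-one assignment by exhaustive search.
    Assumes len(tracks) <= len(dets); returns (track_idx, det_idx) pairs."""
    best_cost, best_perm = None, ()
    for perm in permutations(range(len(dets)), len(tracks)):
        cost = sum(match_cost(tracks[i], dets[j]) for i, j in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm))

# Toy data: two stored trajectories and three detections in the new frame.
tracks = [{"box": (100, 100, 20, 40), "feat": [1.0, 0.0]},
          {"box": (300, 200, 30, 30), "feat": [0.0, 1.0]}]
dets = [{"box": (302, 203, 30, 28), "feat": [0.1, 0.9]},
        {"box": (103, 101, 20, 40), "feat": [0.9, 0.1]},
        {"box": (500, 500, 10, 10), "feat": [0.7, 0.7]}]
print(assign(tracks, dets))
```

In this toy case each trajectory is paired with the detection that is close in both position and appearance, and the third detection stays unmatched, which a tracker would typically treat as a candidate new object.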