D-VAT: End-to-End Visual Active Tracking for Micro Aerial Vehicles

Alberto Dionigi, Simone Felicioni, Mirko Leomanni, Gabriele Costante

DOI: https://doi.org/10.1109/LRA.2024.3385700

2024-04-07

Abstract: Visual active tracking is a growing research topic in robotics due to its key role in applications such as human assistance, disaster recovery, and surveillance. In contrast to passive tracking, active tracking approaches combine vision and control capabilities to detect and actively track the target. Most of the work in this area focuses on ground robots, while the very few contributions on aerial platforms still pose important design constraints that limit their applicability. To overcome these limitations, in this paper we propose D-VAT, a novel end-to-end visual active tracking methodology based on deep reinforcement learning that is tailored to micro aerial vehicle platforms. The D-VAT agent computes the vehicle thrust and angular velocity commands needed to track the target by directly processing monocular camera measurements. We show that the proposed approach allows for precise and collision-free tracking operations, outperforming different state-of-the-art baselines on simulated environments which differ significantly from those encountered during training. Moreover, we demonstrate a smooth real-world transition to a quadrotor platform with mixed-reality.

Computer Science

What problem does this paper attempt to address?

This paper proposes D-VAT (Deep Visual Active Tracking), an end-to-end approach to visual active tracking for Micro Aerial Vehicles (MAVs). In visual active tracking, the robot must not only detect the target but also control its own motion to keep the target in its field of view. Traditional methods usually rely on separate perception and control modules, whereas D-VAT uses Deep Reinforcement Learning (DRL) to compute the required thrust and angular velocity commands directly from monocular camera measurements. Unlike most prior work, which focuses on ground robots, D-VAT targets aerial platforms such as micro drones, whose dynamics make the tracking policy harder to learn. The paper points out that existing methods often ignore vehicle dynamics or restrict the set of possible actions, which limits their performance and robustness. D-VAT addresses these issues by avoiding strict assumptions about the motion of the target or the tracker and by mapping RGB images directly to continuous control commands. In simulation, D-VAT outperforms both model-based and data-driven state-of-the-art baselines in tracking accuracy and robustness, and it is deployed on a real quadcopter without fine-tuning, which indicates that it generalizes across environments.
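
To make "continuous control commands" concrete, here is a minimal sketch of how an unbounded policy output could be turned into a bounded thrust and body-rate command. It is not the authors' implementation. The names `Command` and `to_command`, the limits `MAX_THRUST_N` and `MAX_RATE_RAD_S`, and the tanh squashing are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Command:
    thrust: float      # collective thrust in newtons
    body_rates: tuple  # (roll, pitch, yaw) angular velocities in rad/s


# Hypothetical actuator limits, chosen only for illustration.
MAX_THRUST_N = 20.0
MAX_RATE_RAD_S = 3.0


def to_command(raw_action):
    """Map an unbounded 4-vector (e.g. a policy network output) to a bounded command."""
    if len(raw_action) != 4:
        raise ValueError("expected 4 values: thrust plus 3 body rates")
    # Squash every component into [-1, 1] so the command can never exceed the limits.
    squashed = [math.tanh(a) for a in raw_action]
    # Thrust is non-negative: rescale [-1, 1] to [0, MAX_THRUST_N].
    thrust = 0.5 * (squashed[0] + 1.0) * MAX_THRUST_N
    # Body rates are symmetric around zero: rescale to [-MAX_RATE_RAD_S, MAX_RATE_RAD_S].
    rates = tuple(s * MAX_RATE_RAD_S for s in squashed[1:])
    return Command(thrust=thrust, body_rates=rates)
```

Under these assumed limits, a zero action gives a mid-range thrust and zero angular velocity, and arbitrarily large actions saturate at the limits instead of producing infeasible commands.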

Fast-Tracker 2.0: Improving Autonomy of Aerial Tracking with Active Vision and Human Location Regression

Enhancing Continuous Control of Mobile Robots for End-to-End Visual Active Tracking

Vision-based Relative Detection and Tracking for Teams of Micro Aerial Vehicles

AD-VAT+: an Asymmetric Dueling Mechanism for Learning and Understanding Visual Active Tracking

A Vision-based UAV Tracker Aiming at Aerial Targets

Unifying Foundation Models with Quadrotor Control for Visual Tracking Beyond Object Categories

Real-Time Visual Tracking of Moving Targets Using a Low-Cost Unmanned Aerial Vehicle with a 3-Axis Stabilized Gimbal System

A Cross-Scene Benchmark for Open-World Drone Active Tracking

Space Non-cooperative Object Active Tracking with Deep Reinforcement Learning

PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Autonomous navigation of micro aerial vehicles using high-rate and low-cost sensors

Active Perception Based Formation Control for Multiple Aerial Vehicles

Leveraging Event Streams with Deep Reinforcement Learning for End-to-End UAV Tracking

UAV-based autonomous detection and tracking of beyond visual range (BVR) non-stationary targets using deep learning

Vision-Based Topological Localization for MAVs

Effective Target Aware Visual Navigation for UAVs

AutoTrack: Towards High-Performance Visual Tracking for UAV With Automatic Spatio-Temporal Regularization

Probabilistic 3D motion model for object tracking in aerial applications

Active Object Detection and Tracking Using Gimbal Mechanisms for Autonomous Drone Applications

Differential GNSS and Vision-Based Tracking to Improve Navigation Performance in Cooperative Multi-UAV Systems