Abstract: In this work, we address multi-target active object tracking (AOT), where the goal is to track targets continuously through real-time control of the cameras. This form of active camera control can be applied to unmanned aerial vehicles (UAVs), intelligent robots, and sports events. Our work is conducted in an environment featuring multiple cameras and targets, where our goal is to maximize target coverage. In contrast to previous research, our work introduces additional degrees of freedom for the cameras, allowing them not only to rotate but also to move along boundary lines. In addition, we model the motion of the targets to predict their future positions in the environment. Given the targets' future positions, we use Monte Carlo Tree Search (MCTS) to find the optimal camera actions. Since the action space is large, we propose to use the action selection of a multi-agent reinforcement learning (MARL) network to prune the MCTS search tree, so as to find the optimal action more efficiently. We establish a multi-target 2D environment to simulate several sports games, and experimental results demonstrate that our method can effectively improve the target coverage. The code is available at: http://github.com/HopeChanger/ActiveObjectTracking.
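
To make the coverage objective and the extra degree of freedom concrete, here is a minimal sketch, not the authors' implementation. It assumes a rectangular field, a camera described by a scalar position `s` along the field perimeter plus a heading angle, and a simple field-of-view test. The function names, the 90-degree field of view, and the perimeter parameterization are all assumptions made for illustration.

```python
import math

def boundary_point(s, width, height):
    """Map a perimeter coordinate s to (x, y), counter-clockwise from (0, 0).
    Hypothetical parameterization of 'moving along boundary lines'."""
    s %= 2 * (width + height)
    if s < width:                      # bottom edge
        return (s, 0.0)
    s -= width
    if s < height:                     # right edge
        return (width, s)
    s -= height
    if s < width:                      # top edge
        return (width - s, height)
    s -= width
    return (0.0, height - s)           # left edge

def is_covered(cam_xy, heading, target_xy, fov=math.pi / 2, max_range=math.inf):
    """True if the target lies within the camera's (assumed) view sector."""
    dx, dy = target_xy[0] - cam_xy[0], target_xy[1] - cam_xy[1]
    if math.hypot(dx, dy) > max_range:
        return False
    # Signed angular difference wrapped into [-pi, pi).
    diff = (math.atan2(dy, dx) - heading + math.pi) % (2 * math.pi) - math.pi
    return abs(diff) <= fov / 2

def coverage_rate(cameras, targets, width, height):
    """Fraction of targets seen by at least one camera.
    Each camera is a (perimeter position s, heading) pair."""
    if not targets:
        return 0.0
    poses = [(boundary_point(s, width, height), h) for s, h in cameras]
    seen = sum(any(is_covered(xy, h, t) for xy, h in poses) for t in targets)
    return seen / len(targets)

# Two cameras on a 10 x 6 field: one on the bottom edge looking up,
# one on the right edge looking left.
cameras = [(5.0, math.pi / 2), (13.0, math.pi)]
targets = [(5.0, 3.0), (1.0, 1.0), (9.0, 0.5), (7.0, 3.0)]
rate = coverage_rate(cameras, targets, width=10.0, height=6.0)
```

With this parameterization, each camera action changes only two numbers (`s` and the heading), which is one way to read "rotate and move along boundary lines"; the paper's actual state and action encoding may differ.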

What problem does this paper attempt to address?

This paper attempts to address the problem of continuous tracking of multiple targets in Active Object Tracking (AOT) by controlling the cameras in real time. Specifically, the authors focus on how to maximize target coverage in a multi-camera, multi-target environment. Unlike previous studies, this work introduces more degrees of freedom for the cameras, allowing not only rotation but also movement along the boundary. Additionally, the authors model the motion of the targets to predict their future positions and use the Monte Carlo Tree Search (MCTS) method to find the optimal camera actions. To improve search efficiency, they propose using the actions selected by a Multi-Agent Reinforcement Learning (MARL) network to prune the MCTS search tree.

### Main Contributions of the Paper:

1. **Extended the degrees of freedom for the camera**: Increased the positional freedom of the camera in multi-target AOT, making the camera setup closer to scenarios in actual sports events.
2. **Proposed the MARL network**: Predicted camera actions in a centralized setting, fully utilizing the information between cameras.
3. **Modeled target motion**: Predicted the future state of the environment and combined it with the MCTS method to obtain better actions based on future information.

### Research Background:

- **Single-target Active Object Tracking**: Early research mainly focused on single-target tracking, usually divided into two stages: target detection and camera control. In recent years, with the development of deep learning, passive target tracking has made significant progress, but the two-stage method still has bottlenecks in dealing with occlusion and camera control errors.
- **Multi-target Active Object Tracking**: When there are multiple targets in the environment, it is difficult for one camera to track all targets simultaneously, thus requiring an evaluation of the coverage of multiple cameras. Related studies include target coverage enhancement in directional sensor networks and the application of multi-agent reinforcement learning in multi-target tracking.

### Method Overview:

1. **2D Environment**: Designed a 2D environment simulating sports event scenarios, where cameras can move along the boundary of the field and rotate freely to achieve real-time tracking.
2. **Position Prediction Module**: Used current and historical observation states to predict future environmental states, providing a basis for Monte Carlo Tree Search.
3. **Multi-Agent Reinforcement Learning Module**: Pruned the search tree using the actions output by the MARL network to improve search efficiency.
4. **Monte Carlo Tree Search Module**: Searched for camera actions based on current and future environmental states to find the optimal actions.

### Experimental Results:

The authors established a 2D environment to simulate various sports event scenarios. Experimental results show that their method can effectively improve target coverage.

### Conclusion:

By introducing more degrees of freedom for the camera, modeling target motion, and using multi-agent reinforcement learning to prune the search tree, this paper effectively addresses the target coverage problem in multi-target active object tracking.
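
The three modules in the method overview (position prediction, policy-based pruning, tree search) can be sketched together as follows. This is a simplified illustration under stated assumptions, not the paper's code: it uses a single camera sliding along the bottom edge of the field, a constant-velocity motion model, and a greedy one-step heuristic (`prior_top_k`) as a placeholder for the MARL network. Every name, constant, and the reward definition are hypothetical.

```python
import math
import random

# All names and constants are hypothetical, chosen for illustration only.
MOVES = (-1.0, 0.0, 1.0)                   # slide along the bottom boundary (y = 0)
TURNS = (-math.pi / 8, 0.0, math.pi / 8)   # change in heading, radians
ACTIONS = [(m, t) for m in MOVES for t in TURNS]

def predict_targets(prev, curr, steps):
    """Constant-velocity extrapolation (an assumed motion model)."""
    return [[(c[0] + k * (c[0] - p[0]), c[1] + k * (c[1] - p[1]))
             for p, c in zip(prev, curr)] for k in range(1, steps + 1)]

def step_camera(cam, action, width=20.0):
    x, heading = cam
    return (min(max(x + action[0], 0.0), width), heading + action[1])

def coverage(cam, targets, fov=math.pi / 2):
    """Fraction of targets inside the view sector of a camera at (x, 0)."""
    x, heading = cam
    hits = 0
    for tx, ty in targets:
        diff = (math.atan2(ty, tx - x) - heading + math.pi) % (2 * math.pi) - math.pi
        hits += abs(diff) <= fov / 2
    return hits / len(targets)

def prior_top_k(cam, targets, k=3):
    """Placeholder for the MARL policy: keep the k actions with best one-step coverage."""
    return sorted(ACTIONS, key=lambda a: -coverage(step_camera(cam, a), targets))[:k]

class Node:
    def __init__(self, cam, depth):
        self.cam, self.depth = cam, depth
        self.children = {}                 # action -> Node
        self.visits, self.value = 0, 0.0

def ucb(parent, child, c):
    return child.value / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)

def search(cam, future, iters=200, k=3, c=1.0, seed=0):
    """MCTS over the pruned action set; future[d] holds predicted targets at step d + 1."""
    rng = random.Random(seed)
    root, horizon = Node(cam, 0), len(future)
    for _ in range(iters):
        node, path, ret = root, [root], 0.0
        while node.depth < horizon:
            allowed = prior_top_k(node.cam, future[node.depth], k)   # pruning step
            untried = [a for a in allowed if a not in node.children]
            if untried:                                              # expansion
                a = untried[0]
                node.children[a] = Node(step_camera(node.cam, a), node.depth + 1)
            else:                                                    # UCB selection
                a = max(allowed, key=lambda b: ucb(node, node.children[b], c))
            node = node.children[a]
            ret += coverage(node.cam, future[node.depth - 1])
            path.append(node)
            if untried:
                break
        cam_r = node.cam
        for d in range(node.depth, horizon):                         # random rollout
            cam_r = step_camera(cam_r, rng.choice(prior_top_k(cam_r, future[d], k)))
            ret += coverage(cam_r, future[d])
        for n in path:                                               # backpropagation
            n.visits += 1
            n.value += ret
    return max(root.children, key=lambda a: root.children[a].visits)

prev = [(4.0, 5.0), (6.0, 5.0)]
curr = [(5.0, 5.0), (7.0, 5.0)]
future = predict_targets(prev, curr, 3)
best = search((0.0, math.pi / 2), future)
```

In this toy setup the targets drift to the right of a camera that starts in the left corner facing straight up, so every action that survives pruning at the root turns the camera clockwise. Pruning reduces the branching factor from 9 to `k` at each depth, which is the efficiency argument the summary attributes to the MARL module; in the paper the pruning signal comes from a trained network over joint multi-camera actions rather than from a greedy heuristic.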

Optimizing Camera Motion with MCTS and Target Motion Modeling in Multi-Target Active Object Tracking

Multi-Target Active Object Tracking with Monte Carlo Tree Search and Target Motion Modeling

Coordinate-Aligned Multi-Camera Collaboration for Active Multi-Object Tracking

MAT: Motion-Aware Multi-Object Tracking

Multiple moving targets tracking from sensor scheduling perspective

Pose-Assisted Multi-Camera Collaboration for Active Object Tracking

Spatial-Semantic and Temporal Attention Mechanism-Based Online Multi-Object Tracking

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

Multi-object Tracking Within Air-Traffic-Control Surveillance Videos

Real-time Multi-Object Tracking Based on Bi-directional Matching

CSCMOT: Multi-object tracking based on channel spatial cooperative attention mechanism

Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism

Multi-Task Structure-Aware Context Modeling for Robust Keypoint-Based Object Tracking

Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking

Online Multi-Target Tracking for Maneuvering Vehicles in Dynamic Road Context

Multi-object tracking algorithm based on interactive attention network and adaptive trajectory reconnection

Dynamic Attention Guided Multi-Trajectory Analysis for Single Object Tracking

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking With Camera-LiDAR Fusion

Multi-Target Multi-Camera Tracking by Tracklet-to-Target Assignment

Online Multi-Object Tracking with Instance-Aware Tracker and Dynamic Model Refreshment