Abstract: This paper proposes CAMOT, a simple camera angle estimator for multi-object tracking to tackle two problems: 1) occlusion and 2) inaccurate distance estimation in the depth direction. Under the assumption that multiple objects are located on a flat plane in each video frame, CAMOT estimates the camera angle using object detection. In addition, it gives the depth of each object, enabling pseudo-3D MOT. We evaluated its performance by adding it to various 2D MOT methods on the MOT17 and MOT20 datasets and confirmed its effectiveness. Applying CAMOT to ByteTrack, we obtained 63.8% HOTA, 80.6% MOTA, and 78.5% IDF1 in MOT17, which are state-of-the-art results. Its computational cost is significantly lower than the existing deep-learning-based depth estimators for tracking.

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper aims to address two main issues in Multi-Object Tracking (MOT):

1. **Occlusion Problem**: In real-world scenarios, target objects are often occluded by other objects, leading to detection failures.
2. **Inaccurate Distance Estimation in Depth Direction**: When multiple objects are aligned in the depth direction, it is difficult to accurately estimate the distance between them, which may result in incorrect object association between frames.

To tackle these problems, the paper proposes a method called CAMOT (Camera Angle-aware Multi-Object Tracking). CAMOT estimates the camera angle and uses it to provide depth information for each object, which enables pseudo-3D multi-object tracking. Specifically, CAMOT assumes that the objects in each video frame are located on a flat plane and uses object detection to estimate the camera angle. The resulting depth helps separate objects that occlude one another and gives a more accurate measure of distance in the depth direction, which improves object association between frames.

### Main Contributions

1. **Lightweight Camera Angle Estimator**: Uses object detection positions to estimate the camera angle.
2. **Frame-to-Frame Object Association Using Camera Angle and Object Depth**: Combines camera angle and object depth information in 2D MOT to improve association accuracy.
3. **Evaluation on Various 2D MOT Methods**: Adds CAMOT to various 2D MOT methods on the MOT17 and MOT20 datasets to verify its effectiveness.

### Experimental Results

- On the MOT17 dataset, ByteTrack with CAMOT achieved 63.8% HOTA, 80.6% MOTA, and 78.5% IDF1, which the paper reports as state-of-the-art results.
- The computational cost is significantly lower than that of existing deep-learning-based depth estimators, with a reported speed of 24.92 FPS on a single A100 GPU.

### Conclusion

CAMOT tackles occlusion and depth-direction distance estimation in multi-object tracking with a simple camera angle estimator. It improves tracking performance at low computational cost, which makes it suitable for practical applications.
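### Illustrative Sketch: Depth from Camera Angle on a Flat Plane

To make the flat-plane idea concrete, the following is a minimal pinhole-geometry sketch of how a known camera angle can turn a bounding box's bottom edge into a ground distance. It is not the paper's estimator: the function name `ground_distance` and the parameters `focal_px`, `cy`, `cam_height`, and `pitch_rad` are assumptions introduced here for illustration, and the camera intrinsics and height are assumed to be known.

```python
import math


def ground_distance(v_bottom, focal_px, cy, cam_height, pitch_rad):
    """Horizontal distance from the camera to a ground point.

    Hypothetical helper (not from the paper). Assumes a pinhole camera,
    a flat ground plane, and image rows that increase downward.

    v_bottom   : image row of the bounding box's bottom edge (pixels)
    focal_px   : focal length in pixels (assumed known)
    cy         : principal point row in pixels (assumed known)
    cam_height : camera height above the ground plane
    pitch_rad  : downward tilt of the optical axis from horizontal
    """
    # Angle of the viewing ray below the horizontal: the camera pitch plus
    # the angle between the optical axis and the ray through row v_bottom.
    ray_angle = pitch_rad + math.atan2(v_bottom - cy, focal_px)
    if ray_angle <= 0:
        raise ValueError("row is at or above the horizon; the ray never hits the ground")
    # Right triangle: camera height is the opposite side, ground distance
    # is the adjacent side.
    return cam_height / math.tan(ray_angle)


if __name__ == "__main__":
    pitch = math.radians(30)
    # Two detections whose boxes end at different image rows.
    for row in (540, 800):
        d = ground_distance(row, focal_px=1000, cy=540, cam_height=5.0, pitch_rad=pitch)
        print(f"bottom row {row}: ground distance {d:.2f}")
```

Under these assumptions, a box whose bottom edge sits lower in the image maps to a smaller ground distance, which is the kind of depth ordering that lets a tracker tell apart objects that overlap in 2D.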

CAMOT: Camera Angle-aware Multi-Object Tracking

MAT: Motion-Aware Multi-Object Tracking

APPTracker Plus: Displacement Uncertainty for Occlusion Handling in Low-Frame-Rate Multiple Object Tracking

Exploit the Connectivity: Multi-Object Tracking with TrackletNet

APPTracker: Improving Tracking Multiple Objects in Low-Frame-Rate Videos

Online Multi-Object Tracking from A Bird's-Eye View by Fusion of Millimeter-Wave Radar and Vision

CAMO-MOT: Combined Appearance-Motion Optimization for 3D Multi-Object Tracking With Camera-LiDAR Fusion

Object-Level Pseudo-3D Lifting for Distance-Aware Tracking

Multi-object tracking with adaptive measurement noise and information fusion

Multimodal Multiobject Tracking by Fusing Deep Appearance Features and Motion Information

Online Multi-Object Tracking Using CNN-based Single Object Tracker with Spatial-Temporal Attention Mechanism

CAMTrack: a combined appearance-motion method for multiple-object tracking

ETTrack: Enhanced Temporal Motion Predictor for Multi-Object Tracking

Real-time Multi-Object Tracking Based on Bi-directional Matching

VariabilityTrack: Multi-Object Tracking with Variable Speed Object Movement

EagerMOT: 3D Multi-Object Tracking via Sensor Fusion

Multi-object tracking algorithm based on interactive attention network and adaptive trajectory reconnection

DepthMOT: Depth Cues Lead to a Strong Multi-Object Tracker

CollabMOT Stereo Camera Collaborative Multi Object Tracking

Spatial-Semantic and Temporal Attention Mechanism-Based Online Multi-Object Tracking

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking