Reconciling global and local optimal label assignments for heavily occluded pedestrian detection

DOI: https://doi.org/10.1007/s00530-024-01304-0
IF: 3.9
2024-03-30
Multimedia Systems
Abstract: Heavily occluded pedestrian detection remains challenging for CNN detectors. Recent methods such as OTA and simOTA use optimal transport for label assignment but still encounter limitations in handling local occlusion. To tackle this issue, we investigate the relationship between data assignment algorithms and the label assignment problem. We propose a theoretical framework that explains the underlying causes of suboptimal label assignments in heavily occluded regions and identifies the ideal assignment method. In pursuit of this ideal method, we propose two label assignment methods: the K-means method (KMM) and the LAPJV method (LAM), which correspond to the clustering algorithm and the linear assignment problem, respectively. KMM assigns anchors based on the lowest cost, similar to K-means clustering. LAM applies LAPJV iteratively on occluded regions for local optimization and maintains global optimality in non-occluded regions. LAM also achieves a 30% reduction in execution time compared to OTA. We provide both theoretical analysis and experimental validation to demonstrate that LAM is the ideal method in our theoretical framework. It efficiently reconciles global and local optimal assignments, thus achieving the highest performance in Average Precision (AP) and Recall on five datasets: CrowdHuman, WiderPerson, CityPersons, COCOPersons, and COCO.
computer science, information systems, theory & methods
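
The contrast the abstract draws between lowest-cost assignment (as in KMM) and linear assignment (as in LAM) can be illustrated with a toy example. The sketch below is not the authors' implementation: the cost matrix is made up, the function names are hypothetical, and a brute-force search over permutations stands in for the LAPJV solver, which is only practical for very small inputs. It assumes rows are ground-truth pedestrians and columns are anchors.

```python
from itertools import permutations

# Hypothetical cost matrix: rows = 2 overlapping ground-truth boxes,
# columns = 3 anchors. Lower cost means a better match.
cost = [
    [0.10, 0.20, 0.70],  # ground truth 0 (less occluded)
    [0.30, 0.35, 0.80],  # ground truth 1 (heavily occluded)
]

def lowest_cost_assignment(cost):
    """Each anchor independently picks the ground truth with the lowest cost,
    in the spirit of the K-means-style rule described for KMM."""
    n_gt, n_anchor = len(cost), len(cost[0])
    return [min(range(n_gt), key=lambda g: cost[g][a]) for a in range(n_anchor)]

def linear_assignment_bruteforce(cost):
    """Give every ground truth one distinct anchor so that the total cost is
    minimal. Brute force over permutations; a stand-in for LAPJV."""
    n_gt, n_anchor = len(cost), len(cost[0])
    best = min(
        permutations(range(n_anchor), n_gt),
        key=lambda p: sum(cost[g][p[g]] for g in range(n_gt)),
    )
    return list(best)

anchor_to_gt = lowest_cost_assignment(cost)
gt_to_anchor = linear_assignment_bruteforce(cost)
print("anchor -> ground truth:", anchor_to_gt)
print("ground truth -> anchor:", gt_to_anchor)
```

In this toy case every anchor prefers ground truth 0, so the lowest-cost rule leaves the occluded ground truth 1 without any anchor, while the one-to-one assignment guarantees it receives one. This only illustrates the general difference between the two problem classes, not the paper's specific results.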