Person Re-Identification Using Maintain Translation Invariance and F-triplet Loss.

Shoukang Ma, Chunjie Zhang
DOI: https://doi.org/10.1145/3633637.3633683
2024-01-01
Abstract: Pedestrian re-identification involves matching pedestrian images or videos across different cameras. The goal is to retrieve the same pedestrian, given a query image, from a gallery of images captured by various surveillance devices. Because it is a cross-view task, the spatial position of individuals in images often changes, so maintaining spatial position invariance is crucial. In addition, focusing on pedestrians and minimizing the influence of background noise remain unresolved issues. Deep learning models are often used; however, they exhibit a low-frequency preference: they tend to learn low-frequency features during training and neglect high-frequency features. Hence, simultaneously learning high- and low-frequency features is particularly important in pedestrian re-identification tasks. This paper proposes a novel module that maintains translational invariance in the network architecture. The original loss function is modified using the Fourier transform to better learn high-frequency and low-frequency features. By concurrently learning high- and low-frequency features, the specific region of the pedestrian in the image can be accurately determined, thereby improving the accuracy of pedestrian re-identification. In this way, the robustness of pedestrian re-identification in complex cross-view scenarios is improved. Superior experimental performance is achieved on two large-scale datasets, Market-1501 and DukeMTMC.
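The abstract does not say how the Fourier transform enters the loss, so the following is only an illustrative sketch of one possible reading, not the authors' F-triplet loss. It assumes a standard margin-based triplet loss on feature vectors and adds the same term computed on their DFT magnitude spectra. The function names (`dft_magnitude`, `frequency_triplet_loss`), the margin value, and the `weight` parameter are all hypothetical.

```python
import cmath
import math


def dft_magnitude(x):
    """Magnitude spectrum of a real 1-D sequence via a naive O(n^2) DFT."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss on feature vectors."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def frequency_triplet_loss(anchor, positive, negative, margin=0.3, weight=1.0):
    """Hypothetical combination (an assumption, not the paper's formula):
    the plain triplet term plus the same term on DFT magnitude spectra."""
    spatial = triplet_loss(anchor, positive, negative, margin)
    spectral = triplet_loss(
        dft_magnitude(anchor),
        dft_magnitude(positive),
        dft_magnitude(negative),
        margin,
    )
    return spatial + weight * spectral


# Toy feature vectors: `positive` is a circular shift of `anchor`.
anchor = [1.0, 0.0, 0.0, 0.0]
positive = [0.0, 1.0, 0.0, 0.0]
negative = [0.0, 0.0, 2.0, 2.0]
loss = frequency_triplet_loss(anchor, positive, negative)
```

In this toy example the anchor and the positive differ only by a circular shift, so their magnitude spectra coincide. That is a general property of the DFT and is one reason a frequency-domain term can sit naturally beside a translation-invariance objective. The paper's translation-invariance module is a separate architectural component and is not modeled here.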