Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification

Yanping Li, Yizhang Liu, Hongyun Zhang, Cairong Zhao, Zhihua Wei, Duoqian Miao
DOI: https://doi.org/10.1109/tip.2024.3393360
IF: 10.6
2024-05-07
IEEE Transactions on Image Processing
Abstract: Person re-identification (ReID) typically encounters varying degrees of occlusion in real-world scenarios. Previous methods have addressed this with handcrafted partitions or external cues, but they often compromise semantic information or increase network complexity. In this paper, we propose a new method from a novel perspective, termed OAT. Specifically, we first use a Transformer backbone with multiple class tokens for diverse pedestrian feature learning. However, the self-attention mechanism in the Transformer focuses solely on low-level feature correlations and neglects higher-order relations among different body parts or regions. We therefore propose the Second-Order Attention (SOA) module to capture more comprehensive features. To address computational efficiency, we further derive approximation formulations for implementing second-order attention. Observing that the importance of the semantics associated with different class tokens varies because the location and size of an occlusion are uncertain, we propose the Entropy Guided Fusion (EGF) module for multiple class tokens. By conducting uncertainty analysis on each class token, higher weights are assigned to class tokens with lower information entropy, while lower weights are assigned to those with higher entropy. This dynamic weight adjustment mitigates the impact of occlusion-induced uncertainty on feature learning, thereby facilitating the acquisition of discriminative class token representations. Extensive experiments on occluded and holistic person re-identification datasets demonstrate the effectiveness of the proposed method.
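The entropy-based weighting described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact weighting function, so the version below assumes each class token has its own identity logits, measures uncertainty as the Shannon entropy of the softmax over those logits, and turns negative entropies into fusion weights with a second softmax. The function names (`softmax`, `entropy`, `entropy_guided_fusion`) and the toy inputs are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_guided_fusion(token_features, token_logits):
    """Fuse class-token features, favouring low-entropy (confident) tokens.

    Assumption: weights = softmax(-entropy). The paper may use a
    different mapping from entropy to weight.
    """
    entropies = [entropy(softmax(logits)) for logits in token_logits]
    weights = softmax([-h for h in entropies])
    dim = len(token_features[0])
    fused = [
        sum(w * feat[d] for w, feat in zip(weights, token_features))
        for d in range(dim)
    ]
    return fused, weights, entropies


# Toy example: two class tokens with 2-D features and 3 identity logits each.
features = [[1.0, 0.0], [0.0, 1.0]]
logits = [[5.0, 0.0, 0.0],   # confident prediction, low entropy
          [0.1, 0.0, 0.1]]   # near-uniform prediction, high entropy
fused, weights, entropies = entropy_guided_fusion(features, logits)
```

In this toy case the first token, whose prediction is sharply peaked, receives the larger weight, so the fused feature leans toward it; a token whose prediction is close to uniform (for example, one attending to an occluded region) contributes less.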
computer science, artificial intelligence; engineering, electrical & electronic