Exploiting Multigranular Salient Features with Hierarchical Multi-Mode Attention Network for Pedestrian Re-Identification

Yanbing Geng, Yongjian Lian, Mingliang Zhou, Yixue Kong, Yinong Zhu
DOI: https://doi.org/10.1016/j.jvcir.2020.102914
IF: 2.887
2020-01-01
Journal of Visual Communication and Image Representation
Abstract: In this paper, we propose an end-to-end hierarchical multi-mode attention network with hierarchical adaptive fusion (HMAN-HAF) to learn salient features at different levels for pedestrian re-identification (re-ID). First, a hierarchical multi-mode attention network (HMAN) applies a different attention model at each level, chosen according to that layer’s characteristics. Specifically, refined channel-wise attention (CA) captures valuable high-level semantic information, an attentive region (AR) model detects salient regions in the low layer, and fused attention (FA) captures the salient regions of valuable channels in the middle layer. Second, a hierarchical adaptive fusion (HAF) module combines the complementary strengths of the different-level salient features. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on three challenging benchmarks: Market-1501, DukeMTMC-reID, and CUHK03.
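The three attention modes and the fusion step named in the abstract can be illustrated with a small, parameter-free sketch. This is not the authors' implementation: the abstract does not specify the layers, so the function names (`channel_attention`, `region_attention`, `fused_attention`, `adaptive_fusion`), the sigmoid gating, the softmax fusion weights, and the nested-list feature layout (channels x height x width) are all assumptions made for illustration. The actual modules presumably use learned parameters.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """CA-style gate (assumed form): scale each channel by the sigmoid
    of its global average, so stronger channels are kept more."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def region_attention(feat):
    """AR-style gate (assumed form): scale each spatial location by the
    sigmoid of its mean activation across channels."""
    n_ch = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [[sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
              for j in range(width)] for i in range(height)]
    return [[[feat[c][i][j] * gates[i][j] for j in range(width)]
             for i in range(height)] for c in range(n_ch)]


def fused_attention(feat):
    """FA-style gate (assumed form): channel gating followed by region
    gating, i.e. salient regions of the retained channels."""
    return region_attention(channel_attention(feat))


def adaptive_fusion(vectors, logits):
    """HAF-style fusion (assumed form): softmax the per-level logits and
    return the weighted sum of equal-length feature vectors."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors))
            for k in range(dim)]
```

In a trained network the fusion logits would be learned, so the model itself decides how much each level contributes; here they are passed in by hand purely to show the weighting mechanics.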