Deep Cross-Modality Alignment for Multi-Shot Person Re-IDentification

Zhichao Song, Bingbing Ni, Yichao Yan, Zhe Ren, Yi Xu, Xiaokang Yang
DOI: https://doi.org/10.1145/3123266.3123324
2017-01-01
Abstract: Multi-shot person Re-IDentification (Re-ID) has recently received more research attention because its problem setting is more realistic for applications than single-shot Re-ID. While many large-scale single-shot Re-ID human image datasets have been released, most existing multi-shot Re-ID video sequence datasets contain only a few (i.e., several hundred) human instances, which hinders further improvement of multi-shot Re-ID performance. To this end, we propose a deep cross-modality alignment network, which jointly explores both human sequence pairs and image pairs to facilitate training better multi-shot human Re-ID models, i.e., by transferring knowledge from image data to sequence data. To mitigate the modality-to-modality mismatch issue, the proposed network is equipped with an image-to-sequence adaptation module called the cross-modality alignment sub-network, which maps each human image into a pseudo human sequence to facilitate knowledge transfer and joint training. Extensive experimental results on several multi-shot person Re-ID benchmarks demonstrate the substantial performance gain brought by the proposed network.