Supplementary Materials for “Learning 3D Shape Feature for Texture-insensitive Person Re-identification”

Jiaxing Chen, Xinyang Jiang, Fudong Wang, Jun Zhang, Feng Zheng, Xing Sun, Wei-Shi Zheng
2021-01-01
Abstract: The supplementary materials of “Learning 3D Shape Feature for Texture-insensitive Person Re-identification” mainly consist of further evaluations.

1. Evaluation on different 3D features. We carry out experiments to further demonstrate the effectiveness of our 3D shape feature learning. We compare with: (1) the 3D reconstruction based features [2, 3]; (2) the 3D based ReID model OG-Net [5]. In Table A, the “HMR based feature” indicates 3D features learned by directly adding ReID losses to the original HMR model [2]. The “SPIN based feature F1” is the output of res4 in SPIN [3] without any ReID supervision. The “SPIN based feature F2+supervision” is the feature trained with a subnetwork (the res4 block in ResNet [1]) and identity labels on top of the output of res3 in SPIN [3].

Although the architecture of our 3D branch is similar to the HMR model [2], some important modifications are made to adapt the original 3D reconstruction architecture to ReID feature learning. We decouple the pose and shape branches to thoroughly separate pose- and shape-related features, since the pose-related feature is an interference factor for person ReID and only the shape feature is identity-specific. As shown in Table A, directly using HMR [2] as the 3D reconstruction regularizer achieves unsatisfactory performance, which validates that separate modeling of pose and shape information is important for discriminative shape-related feature learning.

# Equal Contribution. * Corresponding Author.

Table A. Performance (%) of different 3D features.

Models                                  PRCC rank-1  PRCC rank-5  Market1501 rank-1  Market1501 mAP
Our 3D Shape feature                    49.2         84.2         93.2               83.7
HMR [2] based feature                   38.8         70.6         88.4               76.5
SPIN based feature F1 [3]               5.3          14.9         1.8                0.6
SPIN based feature F2 [3] + supervision 31.7         53.9         73.0               48.9
OG-Net [5]                              20.4         45.8         85.9               66.9

We also make a comparison with the state-of-the-art 3D reconstruction model SPIN [3].
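For readers unfamiliar with the metrics reported in Table A, the following is a minimal sketch of how rank-k accuracy and mAP are commonly computed from a query-gallery distance matrix. It is not the authors' evaluation code: the function name `evaluate_retrieval` and its arguments are hypothetical, and the sketch omits the camera-based gallery filtering that standard Market1501 and PRCC protocols apply.

```python
from typing import Dict, Sequence, Tuple


def evaluate_retrieval(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
    ks: Sequence[int] = (1, 5),
) -> Tuple[Dict[int, float], float]:
    """Return ({k: rank-k accuracy}, mAP) as fractions in [0, 1].

    dist[q][g] is the distance between query q and gallery item g.
    Simplification (assumption): no same-camera filtering is applied.
    """
    hits = {k: 0 for k in ks}
    ap_sum = 0.0
    valid = 0
    for q, row in enumerate(dist):
        # Rank gallery items by ascending distance to this query.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # a query with no true match cannot be scored
        valid += 1
        # Rank-k: is the first correct match within the top k?
        first_hit = matches.index(True)
        for k in ks:
            if first_hit < k:
                hits[k] += 1
        # Average precision: mean of precision at each correct match.
        found = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                found += 1
                precision_sum += found / rank
        ap_sum += precision_sum / found
    if valid == 0:
        raise ValueError("no query has a matching gallery identity")
    return {k: hits[k] / valid for k in ks}, ap_sum / valid


# Toy example: 2 queries, 3 gallery items (identities 1, 1, 2).
toy_dist = [[0.1, 0.5, 0.9], [0.2, 0.3, 0.8]]
cmc, mean_ap = evaluate_retrieval(toy_dist, [1, 2], [1, 1, 2])
print(cmc, mean_ap)  # rank-1 = 0.5, rank-5 = 1.0, mAP = 2/3
```

Multiplying the returned fractions by 100 gives percentages in the form used in Table A.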
The performance of “SPIN based feature F1” and “SPIN based feature F2+supervision” is significantly lower than that of our model. This suggests that directly using pretrained 3D reconstruction weights for ReID is limited by the task gap and by the weak discriminative ability of the resulting features. Consequently, our ReID-oriented customization of the 3D model is essential. Moreover, without a specialized mechanism for clothing texture-insensitive feature learning, the 3D based ReID model OG-Net [5] performs poorly on data covering clothing changes.

2. Evaluation on the displacement. Our motivation for introducing free-form displacements is that the low-dimensional shape parameters of SMPL [4] cannot fully represent human body shape; we expect the estimated displacement to help capture a shape representation covering more identity-specific characteristics. To examine the effectiveness of the displacement, we use the version without MGS for a fair comparison, in which only global features are used to separately estimate the shape parameters and the global displacements; this is indicated as “with displacement” in Table B. The version “without displacement” uses only a single global feature to estimate the shape parameters. However, we also do not want the 3D model