Consistency Constraints and Label Optimization for Unsupervised Domain Adaptive Pedestrian Re-Identification
Wu Yuhang, Sang Nong
DOI: https://doi.org/10.11834/jig.220618
2023-01-01
Journal of Image and Graphics
Abstract: Objective Pedestrian re-identification (Re-ID) is one of the key techniques for retrieving pedestrians across multiple cameras in video surveillance systems. To achieve good performance, current Re-ID models are usually trained on a scene with a large number of annotations (the source domain), but their performance drops significantly when they are applied directly to a new scene (the target domain). Re-labeling the new scene would help, but it is time-consuming and labor-intensive. Unsupervised domain adaptive pedestrian re-identification (UDA Re-ID) methods therefore aim to train a model that generalizes well to the target domain using the existing labeled source domain data and the unlabeled target domain data. However, these methods are still challenged by the instability of instance features and by the heterogeneity of pedestrian images, which widens intra-class distances and narrows inter-class distances. Furthermore, current methods cluster the unlabeled target domain data into multiple clusters and assign an encoded pseudo label to each cluster. Because the representation ability of the model is limited, the clustering results are unreliable, especially in the early stage of training: images of one pedestrian may be split into different clusters, while images of different pedestrians may be merged into one cluster. These errors are called pseudo label noise. When such pseudo labels are used as the supervision signal for feature learning (e.g., contrastive learning), the model tends to overfit the pseudo label noise, which suppresses its performance. 
To resolve these problems, we develop a multi-centroid representation network with consistency constraints (MCRNCC), built on the popular multi-centroid representation network (MCRN). Method MCRNCC adds three modules to MCRN to improve the stability of instance features and the robustness of pedestrian features, and to reduce the risk of overfitting pseudo label noise. First, to improve the stability and semantic information of instance features, an instance consistency constraint reduces the feature distance of the same instance under different augmentations. Following recent self-supervised learning works, an exponential moving average model is introduced to output additional features. Each image in the training batch is randomly augmented twice, features are extracted from the two views by the original model and the exponential moving average model respectively, and the cosine distance is used to constrain each feature pair. Second, to improve robustness to the multiple variations in captured images, a camera consistency constraint reduces the distance between the features of positive instance pairs. Specifically, a positive pair and a negative pair, each made of two instances, are selected according to the instances' labels, and a triplet is built from them to optimize the network. Finally, a label ensemble-based optimization converts the one-hot encoded pseudo labels into more reliable soft labels, which improves the robustness of the supervision signal. 
In detail, we add a target domain classifier to generate additional label predictions, and then linearly weight these predictions and the one-hot encoded pseudo labels to obtain refined soft labels. Result To verify the effectiveness of our method, extensive experiments are carried out on four popular UDA Re-ID tasks: 1) Duke→Market, 2) Market→Duke, 3) Duke→MSMT, and 4) Market→MSMT. First, ablation studies are carried out on the modules of MCRNCC. The instance consistency constraint improves mean average precision (mAP) by 0.6%, 0.2%, 0.7%, and 0.8% on the four tasks, respectively, which demonstrates its effectiveness. The camera consistency constraint yields a general improvement on all four tasks. For example, mAP/Rank-1 increases by 3.5%/3.2% and 5.3%/4.7% on Duke→MSMT and Market→MSMT, respectively. In addition, we visualize the feature space of some pedestrians before and after adding the camera consistency constraint, and the visualization shows that the constraint makes the feature space more compact. The label ensemble-based optimization improves mAP by 0.6%, 0.6%, 1.4%, and 0.4% on the four tasks, respectively. Second, the proposed MCRNCC is compared with existing methods. MCRNCC reaches mAP/Rank-1 of 85.0%/94.0%, 73.5%/85.6%, 41.3%/71.6%, and 39.3%/69.5% on the four tasks, surpassing MCRN by 1.2%/0.2%, 2.0%/1.1%, 5.6%/4.1%, and 6.5%/5.1%, respectively. Conclusion We develop MCRNCC to further address the UDA Re-ID problem. The instance consistency constraint and the camera consistency constraint proposed in MCRNCC enable the model to learn more robust pedestrian feature representations, while the proposed label ensemble-based optimization reduces the risk of overfitting pseudo label noise. 
Experiments demonstrate the effectiveness of the three-module MCRNCC and its potential for future work.
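
As a rough illustration of two steps described in the Method part, the Python sketch below computes a cosine-distance consistency loss between features from an original model and an exponential moving average (EMA) model, and linearly weights a one-hot pseudo label with a classifier prediction to form a soft label. It is a minimal sketch, not the authors' implementation: the function names, the EMA momentum of 0.999, and the weight alpha = 0.5 are assumptions made here, since the abstract does not state them, and plain Python lists stand in for network features and parameters.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def ema_update(ema_params, params, momentum=0.999):
    """Move the EMA parameters toward the current parameters.

    The momentum value is an assumed default, not taken from the paper.
    """
    return [momentum * e + (1.0 - momentum) * p
            for e, p in zip(ema_params, params)]


def instance_consistency_loss(model_feats, ema_feats):
    """Mean cosine distance over a batch of feature pairs.

    model_feats[i] and ema_feats[i] are features of two random
    augmentations of the same image, one from each model.
    """
    pairs = list(zip(model_feats, ema_feats))
    return sum(cosine_distance(f, g) for f, g in pairs) / len(pairs)


def refine_labels(one_hot, predicted, alpha=0.5):
    """Linearly weight a one-hot pseudo label with a predicted distribution.

    alpha is a hypothetical weighting coefficient; if both inputs sum to 1,
    the refined soft label also sums to 1.
    """
    return [alpha * y + (1.0 - alpha) * p for y, p in zip(one_hot, predicted)]
```

In this sketch, a smaller alpha trusts the classifier prediction more and the clustering result less; how the weight is actually chosen in MCRNCC is not specified in the abstract.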