Unsupervised Joint Contrastive Learning for Aerial Person Re-Identification and Remote Sensing Image Classification

Guoqing Zhang, Jiqiang Li, Zhonglin Ye
DOI: https://doi.org/10.3390/rs16020422
IF: 5
2024-01-22
Remote Sensing
Abstract: Unsupervised person re-identification (Re-ID) aims to match the query image of a person with images in the gallery without the use of supervision labels. Most existing methods usually generate pseudo-labels through clustering algorithms for contrastive learning, which inevitably results in noisy labels assigned to samples. In addition, methods that only apply contrastive learning at the clustering level fail to fully consider instance-level relationships between instances. Motivated by this, we propose a joint contrastive learning (JCL) framework for unsupervised person Re-ID. Our proposed method involves creating two memory banks to store features of cluster centroids and instances and applies cluster and instance-level contrastive learning, respectively, to jointly optimize the neural networks. The cluster-level contrastive loss is used to promote feature compactness within the same cluster and reinforce identity similarity. The instance-level contrastive loss is used to distinguish easily confused samples. In addition, we use a WaveBlock attention module (WAM), which can continuously wave feature map blocks and introduce attention mechanisms to produce more robust feature representations of a person without considerable information loss. Furthermore, we enhance the quality of our clustering by leveraging camera label information to eliminate clusters containing single camera captures. Extensive experimental results on two widely used person Re-ID datasets verify the effectiveness of our JCL method. Meanwhile, we also used two remote sensing datasets to demonstrate the generalizability of our method.
environmental sciences,imaging science & photographic technology,remote sensing,geosciences, multidisciplinary
What problem does this paper attempt to address?
This paper addresses how to perform aerial person re-identification (Re-ID) and remote sensing image classification effectively without supervision. Specifically, it focuses on improving the accuracy and robustness of matching the same person across cameras, using a Joint Contrastive Learning (JCL) framework in the absence of labeled data. The problem matters in practice because obtaining large amounts of labeled data is costly and time-consuming, which limits the applicability of supervised learning methods.

### Main contributions of the paper

1. **Proposed a Joint Contrastive Learning (JCL) framework**:
   - Combined cluster-level and instance-level contrastive losses to jointly optimize the model.
   - Improved the reliability of positive samples through an instance selection strategy, which helps model training.
2. **Designed a clustering filtering method**:
   - Utilized camera label information to eliminate clusters captured only by a single camera, thereby refining the clustering results.
   - Employed the WaveBlock Attention Module (WAM) to apply the attention mechanism directly in different waved regions and extract more discriminative features.
3. **Extensive experimental verification**:
   - Conducted experiments on two widely used person re-identification datasets to verify the effectiveness of the method.
   - Also conducted experiments on two remote sensing datasets to demonstrate the generalization ability of the method.

### Method overview

1. **Feature extraction module**:
   - Used ResNet-50 combined with the WaveBlock module to extract image features from the unlabeled dataset.
   - Introduced the WaveBlock Attention Module (WAM) to enhance the discriminability of features through non-local blocks.
2. **Clustering optimization module**:
   - Used the DBSCAN method to cluster the extracted image features.
   - Through the clustering filtering strategy, eliminated abnormal instances and clusters captured only by a single camera, improving the reliability of clustering.
3. **Joint contrastive learning module**:
   - Cluster-Level Contrastive Loss (CLL): Used a cluster-level memory bank to store the unique representative feature of each cluster and calculated CLL through the ClusterNCE loss.
   - Instance-Level Contrastive Loss (ILL): Designed an instance selection strategy that selects instances with moderate similarity as reliable positive samples and the closest instances in other clusters as negative samples.

### Loss function

The total loss function of joint contrastive learning is:

\[ L_{\text{Re-ID}} = \mu L_{\text{cs}} + (1 - \mu) L_{\text{is}} \]

where \( L_{\text{cs}} \) is the cluster-level contrastive loss, \( L_{\text{is}} \) is the instance-level contrastive loss, and \( \mu \) is a balancing factor with a value between 0 and 1.

### Experimental results

The paper conducted experiments on multiple datasets, including Market-1501, DukeMTMC-reID, PRAI-1581, and Xiongan New Area. The experimental results show that the proposed JCL method achieves significant performance improvements on these datasets in the unsupervised (label-free) setting.

### Conclusion

This paper proposes an unsupervised joint contrastive learning framework that addresses key problems in aerial person re-identification and remote sensing image classification. By combining cluster-level and instance-level contrastive learning and introducing clustering filtering and the WaveBlock attention module, it significantly improves the performance and robustness of the model.
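
As a rough illustration of the clustering filter and the loss combination summarized above, the following Python sketch may help. It is not the authors' implementation: the function names, the choice to relabel filtered clusters as outliers (`-1`, the label scikit-learn's DBSCAN uses for noise), the temperature of 0.05, and the default `mu=0.5` are all assumptions made for this example. The cluster-level term is written as a generic InfoNCE-style loss against cluster centroids, which is how ClusterNCE is commonly described, and the instance-level term is taken as an already computed number.

```python
import math
from collections import defaultdict


def filter_single_camera_clusters(labels, cam_ids, outlier=-1):
    """Relabel every cluster seen by only one camera as an outlier.

    `labels` are per-image cluster ids (e.g. from DBSCAN), `cam_ids` the
    matching camera ids. Treating removed clusters as outliers is an
    assumption of this sketch.
    """
    cams_per_cluster = defaultdict(set)
    for label, cam in zip(labels, cam_ids):
        if label != outlier:
            cams_per_cluster[label].add(cam)
    # Keep only clusters observed by at least two distinct cameras.
    keep = {c for c, cams in cams_per_cluster.items() if len(cams) > 1}
    return [label if label in keep else outlier for label in labels]


def cluster_nce(feature, centroids, target, temperature=0.05):
    """InfoNCE-style loss of one feature against all cluster centroids.

    Inputs are assumed to be L2-normalised vectors given as plain lists;
    `temperature` is a hypothetical value.
    """
    logits = [
        sum(f * c for f, c in zip(feature, centroid)) / temperature
        for centroid in centroids
    ]
    # Numerically stable log-sum-exp over all centroids.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def joint_loss(l_cs, l_is, mu=0.5):
    """Weighted sum mu * L_cs + (1 - mu) * L_is with mu in [0, 1]."""
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    return mu * l_cs + (1.0 - mu) * l_is


if __name__ == "__main__":
    # Cluster 1 is seen only by camera 3, so it is filtered out.
    print(filter_single_camera_clusters([0, 0, 1, 1, -1], [1, 2, 3, 3, 1]))
    l_cs = cluster_nce([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0)
    print(joint_loss(l_cs, l_is=0.7, mu=0.5))
```

For instance, `joint_loss(2.0, 4.0, mu=0.25)` evaluates to 0.25 · 2.0 + 0.75 · 4.0 = 3.5, so a smaller \( \mu \) shifts the weight toward the instance-level term.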