Unity in Diversity: Multi-expert Knowledge Confrontation and Collaboration for Generalizable Vehicle Re-identification

Zhenyu Kuang,Hongyang Zhang,Lidong Cheng,Yinhao Liu,Yue Huang,Xinghao Ding
2024-07-10
Abstract: Generalizable vehicle re-identification (ReID) aims to enable the well-trained model in diverse source domains to broadly adapt to unknown target domains without additional fine-tuning or retraining. However, it still faces the challenges of domain shift problem and has difficulty accurately generalizing to unknown target domains. This limitation occurs because the model relies heavily on primary domain-invariant features in the training data and pays less attention to potentially valuable secondary features. To solve this complex and common problem, this paper proposes the two-stage Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo) method, which incorporates multiple experts with unique perspectives into Contrastive Language-Image Pretraining (CLIP) and fully leverages high-level semantic knowledge for comprehensive feature representation. Specifically, we propose to construct the learnable prompt set of all specific-perspective experts by adversarial learning in the latent space of visual features during the first stage of training. The learned prompt set with high-level semantics is then utilized to guide representation learning of the multi-level features for final knowledge fusion in the next stage. In this process of knowledge fusion, although multiple experts employ different assessment ways to examine the same vehicle, their common goal is to confirm the vehicle's true identity. Their collective decision can ensure the accuracy and consistency of the evaluation results. Furthermore, we design different image inputs for two-stage training, which include image component separation and diversity enhancement in order to extract the ID-related prompt representation and to obtain feature representation highlighted by all experts, respectively. Extensive experimental results demonstrate that our method achieves state-of-the-art recognition performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses domain generalization in vehicle re-identification (ReID): how to make a model trained on diverse source domains adapt to unknown target domains without additional fine-tuning or retraining. Specifically, it targets the following challenges:

1. **Domain Shift Problem**: When the training and test data follow significantly different distributions, model performance drops sharply. Traditional methods assume that the source and target domains share the same data distribution, an assumption that rarely holds in practical applications.
2. **Insufficient Feature Representation**: Existing domain generalization methods focus mainly on the primary domain-invariant features and ignore potentially valuable secondary features, which limits the generalization ability of the model.

To solve these problems, the paper proposes a method named Multi-expert Knowledge Confrontation and Collaboration (MiKeCoCo). The method incorporates multiple experts with unique perspectives into the Contrastive Language-Image Pretraining (CLIP) model and uses high-level semantic knowledge to achieve a comprehensive feature representation.

### Specific Contributions

1. **Multi-expert Adversarial Learning**: Learning a diverse set of prompts through adversarial learning in the latent space to realize the confrontation and collaboration of multi-expert knowledge.
2. **Enhanced Feature Representation**: Separating causal and non-causal factors through a spectral image enhancement strategy to increase the diversity of source-domain images, thereby improving the generalization ability of the model.
3. **Multi-expert Knowledge Fusion**: Combining primary and secondary features and using the knowledge of multiple experts for final identity verification to ensure the accuracy and consistency of the evaluation results.

### Method Overview

The MiKeCoCo method includes two training stages:

- **First Stage**: Obtaining learnable multi-semantic prompt representations in the visual feature space through adversarial learning, and extracting a pure identity prompt set through the Multi-expert Perspective Module (MEKA).
- **Second Stage**: Using style-perturbed images to further enhance the robustness of the feature extractor, and guiding the final image encoder training through multi-expert knowledge fusion.

### Experimental Results

The experimental results show that MiKeCoCo achieves state-of-the-art recognition performance on multiple public vehicle datasets, supporting its effectiveness.

### Conclusion

This paper tackles the domain generalization problem in vehicle re-identification by introducing multi-expert perspectives and adversarial learning mechanisms, improving the model's generalization ability and accuracy on unknown target domains.
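
The summary mentions a spectral image enhancement strategy and style-perturbed images but does not spell out the operation. As a rough illustration only, the sketch below uses Fourier amplitude mixing, a technique commonly used for style perturbation in domain generalization: the amplitude spectrum of one image is blended with that of another while the phase of the first is kept. Whether MiKeCoCo uses this exact scheme is an assumption, and the function name `mix_amplitude` and the parameter `lam` are hypothetical.

```python
import numpy as np


def mix_amplitude(content_img, style_img, lam=0.5):
    """Blend the Fourier amplitude of two images, keeping the first one's phase.

    Both inputs are float arrays of shape (H, W, C) with identical shapes.
    lam=0 reproduces content_img; lam=1 fully adopts style_img's amplitude.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    # 2-D FFT over the spatial axes; each channel is transformed independently.
    content_spec = np.fft.fft2(content_img, axes=(0, 1))
    style_spec = np.fft.fft2(style_img, axes=(0, 1))

    # Assumption: amplitude carries mostly style, phase carries mostly structure.
    content_amp = np.abs(content_spec)
    content_phase = np.angle(content_spec)
    style_amp = np.abs(style_spec)

    # Linear blend of the two amplitude spectra, recombined with the original phase.
    mixed_amp = (1.0 - lam) * content_amp + lam * style_amp
    mixed_spec = mixed_amp * np.exp(1j * content_phase)

    # The mixed spectrum stays conjugate-symmetric, so the imaginary part of the
    # inverse transform is only floating-point noise and can be dropped.
    return np.real(np.fft.ifft2(mixed_spec, axes=(0, 1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vehicle = rng.random((64, 64, 3))  # stand-in for a source-domain image
    other = rng.random((64, 64, 3))    # stand-in for an image from another domain
    augmented = mix_amplitude(vehicle, other, lam=rng.uniform(0.0, 1.0))
    print(augmented.shape)
```

Pixel values of the result can fall slightly outside the input range, so a caller would typically clip them before feeding the image to an encoder. Sampling `lam` randomly per image is one simple way to vary the perturbation strength during training.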