Abstract: As a promising field in open-world learning, *Novel Class Discovery* (NCD) is usually the task of clustering unseen novel classes in an unlabeled set, based on prior knowledge from labeled data within the same domain. However, the performance of existing NCD methods can be severely compromised when the novel classes are sampled from a different distribution than the labeled ones. In this paper, we explore and establish the solvability of NCD in the cross-domain setting, under the necessary condition that style information must be removed. Based on the theoretical analysis, we introduce an exclusive style removal module that extracts style information distinct from the baseline features, thereby facilitating inference. Moreover, this module is easy to integrate with other NCD methods, acting as a plug-in that improves performance on novel classes whose distribution differs from that of the seen labeled set. Additionally, recognizing the non-negligible influence of different backbones and pre-training strategies on the performance of NCD methods, we build a fair benchmark for future NCD research. Extensive experiments on three common datasets demonstrate the effectiveness of our proposed module.
What problem does this paper attempt to address?
This paper addresses the performance degradation that occurs when Novel Class Discovery (NCD) is performed in cross-domain settings. Existing NCD methods usually assume that the novel classes and the labeled classes come from the same domain. In practice this assumption often fails, and the methods degrade significantly when the novel classes are drawn from a different distribution than the labeled data.
### Main problem description in the paper
1. **Limitations of existing NCD methods**:
- Existing NCD methods usually assume that the data of new classes and labeled classes come from the same domain, but in practical applications, this assumption is often not valid.
- When the new-class data and the labeled data come from different distributions, the performance of existing methods degrades significantly.
2. **Proposal of the cross-domain NCD problem**:
- To meet this challenge, the authors propose the Cross Domain Novel Class Discovery (CDNCD) problem, that is, performing NCD when the new classes and the labeled classes come from different domains.
- The authors demonstrate the failure of existing NCD methods in cross-domain settings through a series of synthetic experiments, which motivates the problem.
3. **Theoretical analysis and solution**:
- The authors first analyze the solvability of the CDNCD problem theoretically and show that removing style information is a necessary condition for solving it.
- Based on this analysis, the authors introduce an exclusive style removal module that extracts style information distinct from the baseline features so that it can be removed, improving model performance in cross-domain settings.
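
The summary does not describe how the style removal module is built, so the following is only a minimal sketch of one way a plug-in penalty could encourage style features to stay distinct from the baseline (content) features. The function names `exclusivity_penalty` and `combined_loss`, the squared-cosine penalty, and the `weight` parameter are all assumptions made for illustration, not the paper's actual module.

```python
import math


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def exclusivity_penalty(content_feats, style_feats):
    """Mean squared cosine similarity between paired content and style features.

    Hypothetical penalty: it is close to 0 when each style vector is
    orthogonal to its content vector and close to 1 when they are parallel.
    """
    if not content_feats or len(content_feats) != len(style_feats):
        raise ValueError("need two non-empty feature lists of equal length")
    sims = [cosine(c, s) for c, s in zip(content_feats, style_feats)]
    return sum(x * x for x in sims) / len(sims)


def combined_loss(ncd_loss, content_feats, style_feats, weight=0.1):
    """Add the penalty to any existing NCD loss value (plug-in style)."""
    return ncd_loss + weight * exclusivity_penalty(content_feats, style_feats)
```

Because the penalty is simply added to whatever loss the host method already computes, the same term could in principle be attached to different NCD baselines without changing their own objectives.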
### Specific contributions of the paper
1. **Define and verify the cross-domain NCD problem**:
- A series of synthetic experiments demonstrates the failure of existing NCD methods in cross-domain settings, and the more challenging and practical CDNCD task is proposed.
2. **Theoretical analysis and solution**:
- The paper presents what it describes as the first theoretical analysis of the CDNCD problem and proposes a method for removing exclusive style information.
- A simple but effective exclusive style removal module is introduced, which can be integrated as a plug-in into other NCD methods to improve their performance on data from different distributions.
3. **Unified experimental framework**:
- The authors point out that different backbone networks and pre-training strategies significantly affect the performance of existing algorithms. They therefore develop a unified experimental code framework that provides a fair benchmark for future research.
4. **Numerical experiment verification**:
- Numerical experiments quantitatively verify the effectiveness of the proposed method and show that it can serve as a plug-in that improves the performance of other NCD methods.
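
The summary does not say which metric the experiments report. Clustering results are commonly scored by clustering accuracy, taken under the best one-to-one matching between predicted clusters and ground-truth classes, so the sketch below assumes that metric. The function name `clustering_accuracy` is hypothetical, and the brute-force search over permutations is only practical for a handful of classes; a Hungarian-algorithm solver would normally replace it.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(y_true, y_pred):
    """Best-match accuracy between cluster ids and class labels.

    Tries every one-to-one assignment of clusters to classes and returns
    the highest fraction of samples that land in their assigned class.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty label lists of equal length")
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    if len(clusters) > len(classes):
        raise ValueError("sketch assumes no more clusters than classes")
    # counts[(cluster, cls)] = number of samples of class cls in that cluster
    counts = Counter(zip(y_pred, y_true))
    best_hits = 0
    for assigned in permutations(classes, len(clusters)):
        hits = sum(counts[(k, c)] for k, c in zip(clusters, assigned))
        best_hits = max(best_hits, hits)
    return best_hits / len(y_true)
```

Cluster ids are arbitrary, so a prediction that groups the samples perfectly but numbers the groups differently from the ground truth still scores 1.0 under this metric.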
### Conclusion
The paper defines the cross-domain NCD problem and proposes a solution: improving model performance on data from different distributions by removing exclusive style information. The authors also establish a unified experimental framework that provides a reference point for future research.