SVasP: Self-Versatility Adversarial Style Perturbation for Cross-Domain Few-Shot Learning

Wenqian Li, Pengfei Fang, Hui Xue
2024-12-12
Abstract: Cross-Domain Few-Shot Learning (CD-FSL) aims to transfer knowledge from seen source domains to unseen target domains, which is crucial for evaluating the generalization and robustness of models. Recent studies focus on using visual styles to bridge the gap between domains. However, these style-based CD-FSL methods suffer from gradient instability and from convergence to poor local optima. This paper addresses these issues and proposes a novel crop-global style perturbation method, called Self-Versatility Adversarial Style Perturbation (SVasP), which jointly enhances gradient stability and escapes from poor sharp minima. Specifically, SVasP simulates more diverse potential target-domain adversarial styles by diversifying input patterns and aggregating localized crop style gradients, which serve as stabilizers for the global style perturbation within one image, a concept we refer to as self-versatility. A novel objective function is then proposed to maximize visual discrepancy while maintaining semantic consistency between global, crop, and adversarial features. With the stabilized global style perturbation in the training phase, one can obtain flatter minima in the loss landscape, boosting the transferability of the model to the target domains. Extensive experiments on multiple benchmark datasets demonstrate that our method significantly outperforms existing state-of-the-art methods. Our code is available at https://github.com/liwenqianSEU/SVasP.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
This paper addresses two challenges in cross-domain few-shot learning (CD-FSL): gradient instability and convergence to poor local optima. The authors point out that existing style-based CD-FSL methods can bridge the gap between domains by changing the style of images, but they often suffer from unstable gradients and get trapped in poor sharp minima during optimization. These problems limit the generalization ability and robustness of the model.

To address these challenges, the authors propose a new framework, Self-Versatility Adversarial Style Perturbation (SVasP). Its main objectives are:

1. **Enhance gradient stability**: Stabilize the global style gradient by introducing style gradients from local crops, thereby alleviating gradient oscillation.
2. **Escape from poor sharp minima**: Enable the model to leave poor sharp minima during optimization and reach smoother, flatter minima, thereby improving its generalization ability.
3. **Maximize visual differences and maintain semantic consistency**: Design a new objective function, called Discrepancy & Consistency Optimization (DCO), to maximize the visual differences between the seen and unseen domains while maintaining the semantic consistency of global and local features.

Through these improvements, SVasP aims to improve model performance in CD-FSL tasks, especially in the single-source CD-FSL scenario, in which only one source-domain dataset is accessible during training and the target-domain dataset is unavailable. Experimental results show that SVasP significantly outperforms existing state-of-the-art methods on multiple benchmark datasets.
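
The idea of stabilizing a global style perturbation with crop style gradients can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "style" means channel-wise feature mean and standard deviation, that gradients are blended with a fixed hypothetical `weight`, and that the perturbation is a sign step in the manner of FGSM. The gradient values are made-up inputs; in a real model they would come from backpropagating a classification loss, and the mean and standard deviation would each have their own gradient.

```python
import math
import random


def style_stats(feat, eps=1e-5):
    """Channel-wise mean and std of a feature map stored as
    a list of channels, each a flat list of spatial values."""
    mus, sigmas = [], []
    for channel in feat:
        mu = sum(channel) / len(channel)
        var = sum((v - mu) ** 2 for v in channel) / len(channel)
        mus.append(mu)
        sigmas.append(math.sqrt(var + eps))
    return mus, sigmas


def aggregate_gradients(global_grad, crop_grads, weight=0.5):
    """Blend the global style gradient with the mean crop gradient.
    The blending rule and `weight` are assumptions for illustration."""
    n = len(crop_grads)
    mean_crop = [sum(g[c] for g in crop_grads) / n
                 for c in range(len(global_grad))]
    return [(1 - weight) * gg + weight * mc
            for gg, mc in zip(global_grad, mean_crop)]


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def perturb_style(style, grad, step):
    """Sign-step perturbation of a style vector (assumed update rule)."""
    return [s + step * sign(g) for s, g in zip(style, grad)]


def restyle(feat, new_mu, new_sigma, eps=1e-5):
    """Normalize each channel, then apply the new statistics
    (the same transfer rule as adaptive instance normalization)."""
    mus, sigmas = style_stats(feat, eps)
    return [
        [ns * (v - m) / s + nm for v in channel]
        for channel, m, s, nm, ns in zip(feat, mus, sigmas, new_mu, new_sigma)
    ]


random.seed(0)
# Toy feature map: 2 channels, 16 spatial positions each.
feat = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(2)]
mu, sigma = style_stats(feat)

# Hypothetical style gradients: one from the whole image, two from crops.
global_grad = [1.0, -1.0]
crop_grads = [[-3.0, -1.0], [-1.0, -1.0]]

# stable_grad == [-0.5, -1.0]: the crops flip the direction for channel 0.
stable_grad = aggregate_gradients(global_grad, crop_grads)

# For brevity the same gradient perturbs both statistics.
adv_mu = perturb_style(mu, stable_grad, step=0.05)
adv_sigma = perturb_style(sigma, stable_grad, step=0.05)
adv_feat = restyle(feat, adv_mu, adv_sigma)
```

In this toy case the global gradient alone would push the style of channel 0 in the positive direction, while the aggregated gradient pushes it in the negative direction, which is the kind of correction the crop gradients are meant to provide when the global gradient is noisy.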