TDG: Text-guided Domain Generalization

Geng Liu, Yuxi Wang
2023-08-19
Abstract: Domain generalization (DG) attempts to generalize a model trained on one or more source domains to an unseen target domain. Motivated by the recent success of visual-and-language pre-trained models, we argue that introducing extra text information is crucial for domain generalization. In this paper, we develop a novel Text-guided Domain Generalization (TDG) paradigm with three components. First, we devise an automatic word generation method that extends the description of the current domains with novel domain-relevant words. Second, we embed the generated domain information into the text feature space, which shares a common representation space with the image features, using the proposed prompt-learning-based text feature generation method. Finally, we use both the input image features and the generated text features to train a specially designed classifier that generalizes well to unseen target domains, while the image encoder is also updated by gradients back-propagated from the classifier. Our experiments show that the techniques incorporated in TDG improve performance and are easy to implement. Results on several domain generalization benchmarks show that the proposed framework achieves superior performance by effectively leveraging the generated text information.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper primarily aims to address the insufficient generalization capability of models in Domain Generalization (DG). Specifically, it covers the following points:

1. **Background and Challenges**: Machine learning models perform well when the training data and test data have similar distributions. In practice, however, test data often come from different scenarios, which leads to the so-called "domain shift." To overcome this shift, researchers have proposed domain generalization methods, which aim to enable models to perform well in unseen target domains.
2. **Limitations of Existing Methods**: Traditional domain generalization methods mainly focus on learning generalization capabilities from images in the source domains. These methods have limited effectiveness and struggle to significantly surpass the simple Empirical Risk Minimization (ERM) baseline.
3. **Proposed Solution in the Paper**: The authors propose a new method, Text-guided Domain Generalization (TDG), which enhances the model's generalization capability by leveraging additional text information. The method first generates domain-related vocabulary, then embeds these words into the text feature space through prompt learning, and finally uses the generated text features together with the image features to train a specially designed classifier that adapts better to unseen target domains.

In summary, the core objective of the paper is to improve how a model handles data from unknown domains by introducing text information, especially when the distribution gap is significant. In this way, TDG uses the domain knowledge carried by text to improve the model's generalization performance.
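The final step, training one classifier on image features and generated text features that live in a shared space, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the embeddings are synthetic unit-norm vectors (a class prototype plus a domain shift) standing in for encoder outputs, the domain names are hypothetical, plain softmax regression replaces the paper's specially designed classifier, and the image-encoder update is omitted. It is not the authors' implementation.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length, as is common for joint image-text embeddings."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def make_feature(prototype, shift, rng, noise=0.05):
    """Synthetic embedding: class prototype + domain shift + small Gaussian noise."""
    return l2_normalize(
        [p + s + rng.gauss(0.0, noise) for p, s in zip(prototype, shift)]
    )


def build_toy_data(seed=0, dim=8, num_classes=3, images_per_pair=10):
    """Return (image_feats, text_feats) as lists of (feature, label) pairs."""
    rng = random.Random(seed)
    prototypes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_classes)]

    def random_shift():
        return [rng.gauss(0.0, 0.3) for _ in range(dim)]

    # Hypothetical domain names; the shifts are random stand-ins for domain style.
    source_shifts = {name: random_shift() for name in ("photo", "cartoon")}
    generated_shifts = {name: random_shift() for name in ("sketch", "painting")}

    # "Image" features come only from the source domains.
    image_feats = [
        (make_feature(prototypes[c], shift, rng), c)
        for shift in source_shifts.values()
        for c in range(num_classes)
        for _ in range(images_per_pair)
    ]
    # "Text" features: one per (generated domain word, class) pair.
    text_feats = [
        (make_feature(prototypes[c], shift, rng), c)
        for shift in generated_shifts.values()
        for c in range(num_classes)
    ]
    return image_feats, text_feats


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_probs(weights, x):
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def mean_loss(weights, data):
    """Average cross-entropy of the linear classifier over (feature, label) pairs."""
    return -sum(math.log(predict_probs(weights, x)[y]) for x, y in data) / len(data)


def train_classifier(data, num_classes, dim, lr=0.5, epochs=200):
    """Full-batch gradient descent on softmax regression (no bias term)."""
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(epochs):
        grads = [[0.0] * dim for _ in range(num_classes)]
        for x, y in data:
            probs = predict_probs(weights, x)
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                for j in range(dim):
                    grads[c][j] += err * x[j]
        for c in range(num_classes):
            for j in range(dim):
                weights[c][j] -= lr * grads[c][j] / len(data)
    return weights


# Train on the union of image features and generated text features.
image_feats, text_feats = build_toy_data()
train_set = image_feats + text_feats
weights = train_classifier(train_set, num_classes=3, dim=8)
print("training loss:", round(mean_loss(weights, train_set), 4))
```

Because both kinds of feature are unit-norm vectors in one space, the classifier treats a text feature for a generated domain word exactly like an extra labeled training sample. That is the property the shared representation space is meant to provide; whether it helps on a real benchmark depends on the actual encoders and is not something this toy can show.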