Domain-Guided Weight Modulation for Semi-Supervised Domain Generalization

Chamuditha Jayanaga Galappaththige, Zachary Izzo, Xilin He, Honglu Zhou, Muhammad Haris Khan
2024-09-04
Abstract: Unarguably, deep learning models capable of generalizing to unseen domain data while leveraging a few labels are of great practical significance due to low developmental costs. In search of this endeavor, we study the challenging problem of semi-supervised domain generalization (SSDG), where the goal is to learn a domain-generalizable model while using only a small fraction of labeled data and a relatively large fraction of unlabeled data. Domain generalization (DG) methods show subpar performance under the SSDG setting, whereas semi-supervised learning (SSL) methods demonstrate relatively better performance, however, they are considerably poor compared to the fully-supervised DG methods. Towards handling this new, but challenging problem of SSDG, we propose a novel method that can facilitate the generation of accurate pseudo-labels under various domain shifts. This is accomplished by retaining the domain-level specialism in the classifier during training corresponding to each source domain. Specifically, we first create domain-level information vectors on the fly which are then utilized to learn a domain-aware mask for modulating the classifier's weights. We provide a mathematical interpretation for the effect of this modulation procedure on both pseudo-labeling and model training. Our method is plug-and-play and can be readily applied to different SSL baselines for SSDG. Extensive experiments on six challenging datasets in two different SSDG settings show that our method provides visible gains over the various strong SSL-based SSDG baselines.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The problem this paper addresses is **Semi-Supervised Domain Generalization (SSDG)**: training a model that performs well on unseen domains using only a small amount of labeled data and a large amount of unlabeled data.

### Problem Background

1. **Domain Generalization (DG)**: Traditional DG methods assume that all source-domain data are fully labeled and aim to train a model that predicts accurately on unseen target-domain data. These methods perform poorly in the SSDG setting, where most of the source data are unlabeled.
2. **Semi-Supervised Learning (SSL)**: SSL methods perform relatively well in the SSDG setting, but a large gap remains compared to fully supervised DG methods. SSL methods usually rely on a single shared classifier to generate pseudo-labels for unlabeled data. With multiple source domains that follow different distributions, this shared classifier loses domain-level expertise, which lowers pseudo-label accuracy and in turn hurts the model's domain generalization ability.

### Core Problem of the Paper

The paper points out that in the SSDG setting, the pseudo-label (PL) accuracy of existing SSL methods drops significantly as the number of source domains increases (see Figure 1). The cause is that a shared classifier, when handling data from different distributions, tends to lose domain-level expertise. The drop in pseudo-label accuracy directly degrades the model's domain generalization ability.

### Solution

To solve this problem, the authors propose a new method, **Domain-Guided Weight Modulation (DGWM)**.
This method improves the accuracy of pseudo-labels in the following ways:

- **Domain Information Aggregation**: In each minibatch, compute the domain information vector \(I^{(k)}\), the mean of the features \(f(u_{i}^{(k)})\) over the minibatch \(B_{u}^{(k)}\) of domain \(k\), to capture the characteristics of the current domain:

  \[ I^{(k)}=\frac{1}{|B_{u}^{(k)}|} \sum_{i \in B_{u}^{(k)}} f(u_{i}^{(k)}) \]

- **Weight Modulation**: From the domain information vector \(I^{(k)}\), learn a soft mask \(M^{(k)}\) that modulates the classifier weights \(W\), yielding domain-specific classifier weights \(W^{(k)}\):

  \[ M^{(k)}=\sigma(G_{1}(I^{(k)}) \times G_{2}(I^{(k)})) \]

  where \(G_{1}\) and \(G_{2}\) are two learnable transformation functions and \(\sigma\) is the element-wise sigmoid function.

- **Noise Injection**: In the learning branch, noise is introduced into the encoder/decoder structure to make the learning of domain information more robust. This helps consistency learning.

### Experimental Results

The paper reports experiments on six challenging DG datasets to verify the effectiveness of the proposed method. The results show that DGWM outperforms existing baseline methods in two different SSDG settings, with the improvement most pronounced when labeled data is scarce (such as 5 or 10 labels).

### Summary

This paper proposes a new solution to the SSDG problem. Domain-guided weight modulation improves the accuracy of pseudo-labels and thereby the domain generalization ability of the model. The method is simple to apply and improves performance on multiple baselines.
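
As an illustration of the Domain Information Aggregation and Weight Modulation steps from the Solution section, the following is a minimal NumPy sketch and not the authors' implementation. Several details are assumptions made only for this example: \(G_{1}\) and \(G_{2}\) are modelled as plain linear maps (the matrices `g1` and `g2`), the \(\times\) in the mask formula is read as an outer product so that the mask has the same shape as \(W\), and the modulated weights are taken to be the element-wise product of the mask and \(W\). All shapes and variable names are hypothetical, and the random arrays stand in for real features and learned parameters.

```python
import numpy as np


def sigmoid(x):
    """Element-wise sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))


def domain_info_vector(features):
    """I^(k): mean feature vector over one domain's unlabeled minibatch."""
    return features.mean(axis=0)


def domain_mask(info, g1, g2):
    """M^(k) = sigmoid(G1(I) x G2(I)).

    Assumption: G1 and G2 are linear maps, and 'x' is an outer product,
    which gives a mask of shape (num_classes, feat_dim).
    """
    a = g1 @ info  # shape (num_classes,)
    b = g2 @ info  # shape (feat_dim,)
    return sigmoid(np.outer(a, b))


def modulated_logits(features, weights, mask):
    """Logits from domain-specific weights W^(k).

    Assumption: W^(k) is the element-wise product of the mask and W.
    """
    w_k = mask * weights
    return features @ w_k.T


# Hypothetical sizes and random stand-ins for features and parameters.
rng = np.random.default_rng(0)
feat_dim, num_classes, batch = 8, 3, 5
features = rng.normal(size=(batch, feat_dim))       # f(u_i) for one domain k
weights = rng.normal(size=(num_classes, feat_dim))  # shared classifier W
g1 = rng.normal(size=(num_classes, feat_dim))       # stand-in for G1
g2 = rng.normal(size=(feat_dim, feat_dim))          # stand-in for G2

info = domain_info_vector(features)
mask = domain_mask(info, g1, g2)
logits = modulated_logits(features, weights, mask)
pseudo_labels = logits.argmax(axis=1)  # hard pseudo-labels, no confidence threshold
```

Because the mask lies between 0 and 1, each domain rescales the shared weights rather than replacing them. This is one simple way to keep a single classifier while still giving each source domain its own effective weights. In an actual SSL baseline the pseudo-labels would typically also pass through that baseline's own confidence filtering, which this sketch omits.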