Invariant Representation via Decoupling Style and Spurious Features from Images

Ruimeng Li, Yuanhao Pu, Zhaoyi Li, Hong Xie, Defu Lian
2024-04-01
Abstract: This paper considers the out-of-distribution (OOD) generalization problem under the setting that both style distribution shift and spurious features exist and domain labels are missing. This setting frequently arises in real-world applications and is overlooked because previous approaches mainly handle only one of these two factors. The critical challenge is decoupling style and spurious features in the absence of domain labels. To address this challenge, we first propose a structural causal model (SCM) for the image generation process, which captures both style distribution shift and spurious features. The proposed SCM enables us to design a new framework called IRSS, which can gradually separate style distribution and spurious features from images by introducing adversarial neural networks and multi-environment optimization, thus achieving OOD generalization. Moreover, it does not require additional supervision (e.g., domain labels) other than the images and their corresponding labels. Experiments on benchmark datasets demonstrate that IRSS outperforms traditional OOD methods and solves the problem of Invariant Risk Minimization (IRM) degradation, enabling the extraction of invariant features under distribution shift.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the out-of-distribution (OOD) generalization problem when both style distribution shift and spurious features are present and domain labels are missing. Specifically, it focuses on OOD generalization under the following settings:

1. **Style distribution shift**: Image styles differ across domains, which changes the data distribution.
2. **Spurious features**: Images contain features that are irrelevant to the target class, and these features can hurt the model's ability to generalize.
3. **Unlabeled domain**: The training data provides no domain labels, which makes it harder for the model to learn invariant features.

### Background and challenges

Traditional OOD generalization methods usually deal with a single factor (either style distribution shift or spurious features) and often require domain labels. In practical applications these assumptions often do not hold:

- **Style distribution shift**: Differences in image style across domains can degrade model performance on new domains.
- **Spurious features**: Features that are irrelevant to the target class may behave differently across domains, which affects the model's ability to generalize.
- **Unlabeled domain**: Accurate domain labels are difficult to obtain in practice, so methods that depend on them are hard to apply.

### Solutions

To address these challenges, the paper proposes a new framework, **IRSS (Invariant Representation Learning via Decoupling Style and Spurious Features)**. Its main contributions include:

1. **Structural causal model (SCM)**: A new structural causal model captures the style distribution shift and spurious features in the image generation process.
2. **Adversarial neural network**: An adversarial neural network aligns the style distribution, reducing the impact of style differences.
3. **Multi-environment optimization**: Multi-environment optimization removes the impact of spurious features, thereby achieving OOD generalization.

### Method overview

1. **Aligning style distribution**:
   - Extract style-specific discriminative features, using the multi-scale outputs of the convolutional layers as style-discriminative features.
   - Re-partition the style labels by clustering and use an adversarial loss to minimize style differences.
2. **Eliminating the impact of spurious features**:
   - On top of the aligned style distribution, use a clustering algorithm to group samples with similar spurious features into the same environment.
   - Apply the idea of IRM (Invariant Risk Minimization) to learn across these environments and achieve cross-environment invariance.
3. **Loss calculation and training**:
   - The final loss function consists of an empirical risk minimization loss, an entropy loss, an adversarial loss, and IRM penalty terms.
   - Train the model with this composite loss so that it generalizes well across environments.

### Experimental results

The paper reports experiments on three benchmark datasets: PACS, OfficeHome, and NICO. The results show that IRSS outperforms existing OOD generalization methods on these datasets, with a particularly significant improvement on NICO.

### Conclusion

By introducing the structural causal model and the adversarial neural network, IRSS can effectively separate the style distribution from spurious features, thereby achieving OOD generalization when domain labels are unavailable. The method offers a new approach to the OOD generalization problem in practical applications.
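
### Illustrative sketch of the composite loss

To make the loss described in the method overview more concrete, the following is a minimal sketch of how an IRM-style penalty can be computed per environment and combined with the other terms. It is not the authors' implementation. It assumes a binary classification task, uses the standard IRMv1 penalty (the squared gradient of the environment risk with respect to a dummy classifier scale at 1), and treats the entropy and adversarial losses as precomputed scalars. The function names, the plain weighted sum, and the weights `lam_ent`, `lam_adv`, and `lam_irm` are hypothetical choices made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce(logits, labels, scale=1.0):
    """Mean binary cross-entropy of (scale * logit) against 0/1 labels."""
    total = 0.0
    for z, y in zip(logits, labels):
        s = scale * z
        # Stable form of -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(s).
        total += max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))
    return total / len(logits)


def irm_penalty(logits, labels):
    """IRMv1 penalty for one environment.

    For binary cross-entropy, the derivative of the risk with respect to a
    dummy scale w, evaluated at w = 1, is mean((sigmoid(z) - y) * z).
    The penalty is the square of that derivative.
    """
    grad = sum((sigmoid(z) - y) * z for z, y in zip(logits, labels))
    grad /= len(logits)
    return grad ** 2


def composite_loss(envs, entropy_loss, adversarial_loss,
                   lam_ent=0.1, lam_adv=0.1, lam_irm=1.0):
    """Weighted sum of ERM, entropy, adversarial, and IRM penalty terms.

    `envs` is a list of (logits, labels) pairs, one per inferred environment.
    The weights and the plain-sum form are assumptions for illustration only.
    """
    erm = sum(bce(z, y) for z, y in envs) / len(envs)
    penalty = sum(irm_penalty(z, y) for z, y in envs) / len(envs)
    return (erm + lam_ent * entropy_loss
            + lam_adv * adversarial_loss + lam_irm * penalty)


# Two hypothetical environments, e.g. produced by clustering.
env_a = ([2.0, -1.0, 0.5], [1, 0, 0])
env_b = ([0.3, -2.0, 1.5], [1, 0, 1])
loss = composite_loss([env_a, env_b], entropy_loss=0.2, adversarial_loss=0.7)
```

In a real training loop the penalty would normally be obtained through automatic differentiation rather than a closed-form gradient, and the adversarial term would come from a style discriminator trained against the feature extractor. The closed form is used here only to keep the sketch dependency-free.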