Taking a Closer Look at Factor Disentanglement: Dual-Path Variational Autoencoder Learning for Domain Generalization

Ying Luo, Guoliang Kang, Kexin Liu, Fuzhen Zhuang, Jinhu Lu
DOI: https://doi.org/10.1109/tmm.2023.3340552
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: Domain generalization (DG) aims to train a model with access to a limited number of source domains for generalizing it across various unseen target domains. The key to solving the DG problem is disentangling domain-invariant features (i.e., semantic factors) from domain-specific features (i.e., variation factors) to facilitate generalizable representation learning. Previous studies either implicitly model the semantic and variation factors or ineffectively constrain the disentangling process, thus rendering the disentanglement incomplete and ineffective. In this study, we propose a novel approach, named DualVAE, to explicitly model and disentangle both the semantic and variation factors. DualVAE is based on the variational autoencoder (VAE) architecture. However, it differs from the conventional VAE in that it consists of two paths, which explicitly model the semantic and variation factors. In addition to the reconstruction loss of VAE and the classification loss, three types of regularizations, namely statistical independence regularization, factorized prior regularization, and prediction consistency regularization, are proposed to further facilitate the disentanglement of factors. Experimental results on representative DG benchmarks show that our method performs favourably against previous state-of-the-art methods. Ablation and visualization results demonstrate that semantic and variation factors can be effectively disentangled.
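To make the two-path idea concrete, the following is a minimal NumPy sketch of a forward pass through a VAE with separate semantic and variation encoders. It is not the authors' implementation. The linear layers, the latent sizes, the standard-normal KL terms (standing in for the factorized prior), and the cross-covariance penalty (standing in for statistical independence) are all assumptions chosen for illustration. Prediction consistency regularization is omitted because the abstract does not specify its form. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Hypothetical helper: small random weights and zero bias for one linear layer."""
    return rng.normal(0.0, 0.1, size=(in_dim, out_dim)), np.zeros(out_dim)

def encode(x, params):
    """Map inputs to the mean and log-variance of a diagonal Gaussian."""
    (w_mu, b_mu), (w_lv, b_lv) = params
    return x @ w_mu + b_mu, x @ w_lv + b_lv

def reparameterize(mu, logvar):
    """Sample z = mu + sigma * eps (standard VAE reparameterization)."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)), averaged over the batch."""
    per_sample = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    return float(np.mean(per_sample))

def cross_covariance_penalty(z_s, z_v):
    """Assumed independence proxy: squared cross-covariance between the two latents."""
    zs = z_s - z_s.mean(axis=0)
    zv = z_v - z_v.mean(axis=0)
    cov = zs.T @ zv / len(z_s)
    return float(np.sum(cov**2))

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy with a numerically stable log-softmax."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))

def dual_path_losses(x, y, d_sem=4, d_var=4, n_classes=3):
    """Run one forward pass and return the individual (unweighted) loss terms."""
    d_in = x.shape[1]
    sem_enc = (linear(d_in, d_sem), linear(d_in, d_sem))  # semantic path
    var_enc = (linear(d_in, d_var), linear(d_in, d_var))  # variation path
    dec_w, dec_b = linear(d_sem + d_var, d_in)            # shared decoder
    clf_w, clf_b = linear(d_sem, n_classes)               # classifier on semantics only

    mu_s, lv_s = encode(x, sem_enc)
    mu_v, lv_v = encode(x, var_enc)
    z_s = reparameterize(mu_s, lv_s)
    z_v = reparameterize(mu_v, lv_v)

    # The decoder sees both factors; the classifier sees only the semantic one.
    x_hat = np.concatenate([z_s, z_v], axis=1) @ dec_w + dec_b
    logits = z_s @ clf_w + clf_b

    return {
        "reconstruction": float(np.mean(np.sum((x - x_hat) ** 2, axis=1))),
        "kl_semantic": kl_to_standard_normal(mu_s, lv_s),
        "kl_variation": kl_to_standard_normal(mu_v, lv_v),
        "classification": cross_entropy(logits, y),
        "independence": cross_covariance_penalty(z_s, z_v),
    }

# Toy batch: 8 samples, 16 features, 3 classes.
x = rng.standard_normal((8, 16))
y = rng.integers(0, 3, size=8)
losses = dual_path_losses(x, y)
```

In a trainable version these terms would be combined into a weighted sum and minimized with an autodiff framework. The weights and the exact form of each regularizer are defined in the paper, not here.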