Abstract: Practitioners often deploy a learned prediction model in a new environment where the joint distribution of covariate and response has shifted. In observational data, the distribution shift is often driven by unobserved confounding factors lurking in the environment, with the underlying mechanism unknown. Confounding can obfuscate the definition of the best prediction model (concept shift) and shift covariates to domains yet unseen (covariate shift). Therefore, a model maximizing prediction accuracy in the source environment could suffer a significant accuracy drop in the target environment. This motivates us to study the domain adaptation problem with observational data: given labeled covariate and response pairs from a source environment, and unlabeled covariates from a target environment, how can one predict the missing target response reliably? We root the adaptation problem in a linear structural causal model to address endogeneity and unobserved confounding. We study the necessity and benefit of leveraging exogenous, invariant covariate representations to cure concept shifts and improve target prediction. This further motivates a new representation learning method for adaptation that optimizes for a lower-dimensional linear subspace and, subsequently, a prediction model confined to that subspace. The procedure operates on a non-convex objective (one that naturally interpolates between predictability and stability/invariance) constrained on the Stiefel manifold. We study the optimization landscape and prove that, when the regularization is sufficient, nearly all local optima align with an invariant linear subspace resilient to both concept and covariate shift. In terms of predictability, we show a model that uses the learned lower-dimensional subspace can incur a nearly ideal gap between target and source risk. Three real-world data sets are investigated to validate our method and theory.
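
To make the idea of optimizing over a subspace on the Stiefel manifold more concrete, the following is a minimal sketch and not the paper's actual objective or algorithm, which the abstract does not specify. It assumes a stand-in objective f(U) = -||U^T c||^2 + lam * ||U^T D U||_F^2, where c is the source covariate-response cross-moment (a proxy for predictability), D is the difference between source and target covariate second moments (a proxy for instability), and lam is the regularization weight. All function names, the objective, and the use of Riemannian gradient descent with a QR retraction are illustrative assumptions.

```python
import numpy as np

# Illustrative stand-in objective (an assumption, not the paper's):
#   f(U) = -||U^T c||^2 + lam * ||U^T D U||_F^2,  subject to U^T U = I_k.
def objective(U, c, D, lam):
    M = U.T @ D @ U
    return -np.sum((U.T @ c) ** 2) + lam * np.sum(M ** 2)

def euclidean_grad(U, c, D, lam):
    # Gradient of f in the ambient space; assumes D is symmetric.
    M = U.T @ D @ U
    return -2.0 * np.outer(c, c @ U) + 4.0 * lam * D @ U @ M

def qr_retract(A):
    # Map a full-rank p x k matrix back onto the Stiefel manifold via QR,
    # fixing column signs so the result is uniquely defined.
    Q, R = np.linalg.qr(A)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

def fit_subspace(c, D, k, lam, steps=200, step0=1.0, seed=0):
    rng = np.random.default_rng(seed)
    p = c.shape[0]
    U = qr_retract(rng.standard_normal((p, k)))
    for _ in range(steps):
        G = euclidean_grad(U, c, D, lam)
        UtG = U.T @ G
        # Project the gradient onto the tangent space at U.
        R = G - U @ (UtG + UtG.T) / 2.0
        f0 = objective(U, c, D, lam)
        step = step0
        # Backtracking: halve the step until the objective decreases.
        while step > 1e-10:
            cand = qr_retract(U - step * R)
            if objective(cand, c, D, lam) < f0:
                U = cand
                break
            step /= 2.0
        else:
            break  # no decreasing step found; treat as converged
    return U

if __name__ == "__main__":
    # Synthetic data (hypothetical): the last covariate changes scale
    # between the source and target environments.
    rng = np.random.default_rng(1)
    n, p = 500, 5
    X_src = rng.standard_normal((n, p))
    y_src = X_src[:, 0] + 0.1 * rng.standard_normal(n)
    X_tgt = rng.standard_normal((n, p))
    X_tgt[:, 4] *= 3.0
    c = X_src.T @ y_src / n
    D = X_src.T @ X_src / n - X_tgt.T @ X_tgt / n
    U = fit_subspace(c, D, k=2, lam=1.0)
    print("orthonormality error:", np.abs(U.T @ U - np.eye(2)).max())
```

Under this stand-in objective, a larger lam puts more weight on directions whose second moments agree across environments, while lam = 0 reduces to maximizing alignment with c alone. A downstream predictor would then be fit on the projected source covariates X_src @ U.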

Exploiting Causal Structure for Robust Model Selection in Unsupervised Domain Adaptation

Attention-based Cross-Layer Domain Alignment for Unsupervised Domain Adaptation

Transporting Causal Mechanisms for Unsupervised Domain Adaptation

On Causality in Domain Adaptation and Semi-Supervised Learning: an Information-Theoretic Analysis for Parametric Models

Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

Causal Domain Adaptation with Copula Entropy based Conditional Independence Test

Scalable Causal Domain Adaptation

Deep causal representation learning for unsupervised domain adaptation

An objective and rapid method for the determination of light dissemination in the lens

Causally Regularized Learning with Agnostic Data Selection Bias

Learning Causal Representations for Robust Domain Adaptation

A cautious approach to constraint-based causal model selection

Learning When the Concept Shifts: Confounding, Invariance, and Dimension Reduction

Bivariate Causal Discovery using Bayesian Model Selection

Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting

Unsupervised Model Adaptation for Source-free Segmentation of Medical Images

Hidden Covariate Shift: A Minimal Assumption For Domain Adaptation

Model-Induced Generalization Error Bound for Information-Theoretic Representation Learning in Source-Data-Free Unsupervised Domain Adaptation

Maximizing conditional independence for unsupervised domain adaptation

DARING: Differentiable Causal Discovery with Residual Independence

Single-Source UDA for Privacy-Preserving Intelligent Fault Diagnosis Based on Domain Augmentation