Exploiting Causal Structure for Robust Model Selection in Unsupervised Domain Adaptation

Trent Kyono, Mihaela van der Schaar
DOI: https://doi.org/10.1109/tai.2021.3101185
2021-12-01
IEEE Transactions on Artificial Intelligence
Abstract: In many real-world settings, such as healthcare, machine learning models are trained and validated on one labeled domain and tested or deployed on another, where feature distributions differ, i.e., there is covariate shift. When annotations are costly or prohibitive, an unsupervised domain adaptation (UDA) regime can be leveraged, requiring only unlabeled samples in the target domain. Existing UDA methods are unable to factor in a model's predictive loss based on predictions in the target domain and, therefore, suboptimally leverage density ratios of only the input covariates in each domain. In this article, we propose a model selection method for leveraging model predictions on a target domain without labels by exploiting the domain invariance of causal structure. We assume or learn a causal graph from the source domain and select models that produce predicted distributions in the target domain that have the highest likelihood of fitting our causal graph. We thoroughly analyze our method under oracle knowledge using synthetic data. We then show on several real-world datasets, including several COVID-19 examples, that our method is able to improve on the state-of-the-art UDA algorithms for model selection.
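The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' scoring function: it assumes a known causal graph with linear-Gaussian mechanisms, fits those mechanisms on the labeled source data, treats them as invariant across domains, and scores each candidate model by how well the target covariates plus that model's predicted labels fit the graph. The names `fit_mechanisms`, `graph_log_likelihood`, and `select_model`, the `parents` dictionary format, and the toy data are all hypothetical.

```python
import numpy as np

def fit_mechanisms(data, parents):
    """Fit a linear-Gaussian conditional for each non-root node (assumed model class)."""
    mechanisms = {}
    for node, pa in parents.items():
        if not pa:
            continue  # root marginals may shift between domains, so they are not scored
        A = np.column_stack([data[:, pa], np.ones(len(data))])
        coef, *_ = np.linalg.lstsq(A, data[:, node], rcond=None)
        resid = data[:, node] - A @ coef
        mechanisms[node] = (pa, coef, resid.var() + 1e-8)
    return mechanisms

def graph_log_likelihood(data, mechanisms):
    """Mean log-likelihood of the data under the fitted conditionals, summed over nodes."""
    total = 0.0
    for node, (pa, coef, var) in mechanisms.items():
        A = np.column_stack([data[:, pa], np.ones(len(data))])
        resid = data[:, node] - A @ coef
        total += np.mean(-0.5 * np.log(2 * np.pi * var) - resid**2 / (2 * var))
    return total

def select_model(models, source, target_x, parents, y_col):
    """Pick the model whose target predictions best fit the source-fitted graph."""
    mechanisms = fit_mechanisms(source, parents)
    scores = {}
    for name, predict in models.items():
        y_hat = predict(target_x)
        # Re-insert the predicted label at its column position in the graph.
        filled = np.column_stack([target_x[:, :y_col], y_hat, target_x[:, y_col:]])
        scores[name] = graph_log_likelihood(filled, mechanisms)
    return max(scores, key=scores.get), scores

# Toy example (hypothetical data): graph X1 -> Y -> X2, columns ordered [X1, Y, X2].
rng = np.random.default_rng(0)

def simulate(x1_mean, n=500):
    x1 = rng.normal(x1_mean, 1.0, n)
    y = 2.0 * x1 + 0.1 * rng.normal(size=n)
    x2 = y + 0.1 * rng.normal(size=n)
    return np.column_stack([x1, y, x2])

source = simulate(0.0)
target_x = simulate(3.0)[:, [0, 2]]  # covariate shift in X1; target labels are dropped
parents = {0: [], 1: [0], 2: [1]}
models = {
    "good": lambda x: 2.0 * x[:, 0],  # consistent with the causal mechanism
    "bad": lambda x: -x[:, 0],        # inconsistent with the causal mechanism
}
best, scores = select_model(models, source, target_x, parents, y_col=1)
print(best, scores)
```

Root-node marginals are left out of the score in this sketch because covariate shift changes them by construction; only the conditional mechanisms are compared across domains.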