Abstract: Many existing approaches to generalizing statistical inference under distribution shift rely on the covariate shift assumption, which posits that the conditional distribution of unobserved variables given observed ones is invariant across populations. However, recent empirical investigations have demonstrated that adjusting for shift in the observed variables (covariate shift) is often insufficient for generalization. In other words, covariate shift does not typically ``explain away'' the distribution shift between settings. Addressing the unknown yet non-negligible shift in the unobserved variables given the observed ones (conditional shift) is therefore crucial for generalizable inference. In this paper, we present empirical evidence from two large-scale multi-site replication studies to support a new role for covariate shift: ``predicting'' the strength of the unknown conditional shift. Analyzing 680 studies across 65 sites, we find that even though the conditional shift is non-negligible, its strength can often be bounded by that of the observable covariate shift. However, this pattern emerges only when the two sources of shift are quantified by our proposed standardized, ``pivotal'' measures. We then interpret this phenomenon by connecting it to similar patterns that can be theoretically derived from a random distribution shift model. Finally, we demonstrate that exploiting the predictive role of covariate shift leads to reliable and efficient uncertainty quantification for target estimates in generalization tasks with partially observed data. Overall, our empirical and theoretical analyses suggest a new way to approach the problem of distribution shift, generalizability, and external validity.

Training Classifiers under Covariate Shift by Constructing the Maximum Consistent Distribution Subset

Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift

Covariate-Shift Generalization Via Random Sample Weighting

Algorithmic Fairness Generalization under Covariate and Dependence Shifts Simultaneously

A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization

A One-step Approach to Covariate Shift Adaptation

Adapting to Continuous Covariate Shift via Online Density Ratio Estimation

Learning Fair Invariant Representations under Covariate and Correlation Shifts Simultaneously

Harnessing the Power of Vicinity-Informed Analysis for Classification under Covariate Shift

Training-Conditional Coverage Bounds under Covariate Shift

Tolerant Algorithms for Learning with Arbitrary Covariate Shift

Transfer Learning under Covariate Shift: Local $k$-Nearest Neighbours Regression with Heavy-Tailed Design

Open-set learning under covariate shift

Beyond Reweighting: On the Predictive Role of Covariate Shift in Effect Generalization

Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift

Double-Weighting for Covariate Shift Adaptation

Unleashing the Power of Graph Data Augmentation on Covariate Distribution Shift

Estimation of prediction error with known covariate shift

Selective Classification Under Distribution Shifts

Conformal Predictive Systems Under Covariate Shift

Distribution-Free Prediction Intervals Under Covariate Shift, With an Application to Causal Inference