Tailored inference for finite populations: conditional validity and transfer across distributions

Ying Jin, Dominik Rothenhäusler
DOI: https://doi.org/10.48550/arXiv.2104.04565
2023-03-21
Abstract: Parameters of sub-populations can be more relevant than super-population ones. For example, a healthcare provider may be interested in the effect of a treatment plan for a specific subset of their patients; policymakers may be concerned with the impact of a policy in a particular state within a given population. In these cases, the focus is on a specific finite population, as opposed to an infinite super-population. Such a population can be characterized by fixing some attributes that are intrinsic to them, leaving unexplained variations like measurement error as random. Inference for a population with fixed attributes can then be modeled as inferring parameters of a conditional distribution. Accordingly, it is desirable that confidence intervals are conditionally valid for the realized population, instead of marginalizing over many possible draws of populations. We provide a statistical inference framework for parameters of finite populations with known attributes. Leveraging the attribute information, our estimators and confidence intervals closely target a specific finite population. When the data is from the population of interest, our confidence intervals attain asymptotic conditional validity given the attributes, and are shorter than those for super-population inference. In addition, we develop procedures to infer parameters of new populations with differing covariate distributions; the confidence intervals are also conditionally valid for the new populations under mild conditions. Our methods extend to situations where the fixed information has a weaker structure or is only partially observed. We demonstrate the validity and applicability of our methods using simulated and real-world data.
Methodology