Weighted Likelihood for Semiparametric Models and Two-phase Stratified Samples, with Application to Cox Regression

Norman E. Breslow, Jon A. Wellner
DOI: https://doi.org/10.48550/arXiv.math/0511389
2005-11-16
Abstract: Weighted likelihood, in which one solves Horvitz-Thompson or inverse probability weighted (IPW) versions of the likelihood equations, offers a simple and robust method for fitting models to two-phase stratified samples. We consider semiparametric models for which solution of infinite-dimensional estimating equations leads to $\sqrt{N}$-consistent and asymptotically Gaussian estimators of both Euclidean and nonparametric parameters. If the phase two sample is selected via Bernoulli (i.i.d.) sampling with known sampling probabilities, standard estimating equation theory shows that the influence function for the weighted likelihood estimator of the Euclidean parameter is the IPW version of the ordinary influence function. By proving weak convergence of the IPW empirical process, and borrowing results on weighted bootstrap empirical processes, we derive a parallel asymptotic expansion for finite population stratified sampling. Whereas the asymptotic variance for Bernoulli sampling involves the within-strata second moments of the influence function, for finite population stratified sampling it involves only the within-strata variances. The latter asymptotic variance also arises when the observed sampling fractions are used as estimates of those known a priori. A general procedure is proposed for fitting semiparametric models with estimated weights to two-phase data. Several of our key results have already been derived for the special case of Cox regression with stratified case-cohort studies, other complex survey designs and missing data problems more generally. This paper is intended to help place this previous work in appropriate context and to pave the way for applications to other models.
Statistics Theory
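
The contrast the abstract draws between Bernoulli and finite population stratified sampling can be illustrated with a small simulation. The sketch below is a toy stand-in, not the paper's semiparametric setting: it assumes the simplest possible "likelihood equation" (the score for a mean), a fixed phase one sample with two strata, and made-up stratum sizes and sampling probabilities. All function names (`ipw_mean`, `bernoulli_sample`, `stratified_sample`, `compare_variances`) are hypothetical and introduced only for illustration.

```python
import random
import statistics


def ipw_mean(y, strata, sampled, probs):
    """Solve the IPW score equation sum_i (xi_i / pi_i) * (y_i - mu) = 0 for mu."""
    num = den = 0.0
    for y_i, s, xi in zip(y, strata, sampled):
        if xi:  # only phase two units contribute
            w = 1.0 / probs[s]  # inverse of the known sampling probability
            num += w * y_i
            den += w
    return num / den


def bernoulli_sample(strata, probs, rng):
    """Bernoulli (i.i.d.) sampling: each unit is kept independently."""
    return [rng.random() < probs[s] for s in strata]


def stratified_sample(strata, probs, rng):
    """Finite population sampling: a fixed number of units per stratum,
    drawn without replacement (size = round(p_s * N_s), an assumption)."""
    by_stratum = {}
    for i, s in enumerate(strata):
        by_stratum.setdefault(s, []).append(i)
    sampled = [False] * len(strata)
    for s, idx in by_stratum.items():
        n_s = round(probs[s] * len(idx))
        for i in rng.sample(idx, n_s):
            sampled[i] = True
    return sampled


def compare_variances(seed=0, reps=300):
    """Empirical phase two variance of the IPW mean under both designs,
    holding the (toy) phase one sample fixed."""
    rng = random.Random(seed)
    strata = [0] * 600 + [1] * 400           # assumed stratum sizes
    y = [rng.gauss(10.0 * s, 1.0) for s in strata]  # stratum means 0 and 10
    probs = {0: 0.1, 1: 0.5}                 # assumed sampling probabilities
    est_b = [ipw_mean(y, strata, bernoulli_sample(strata, probs, rng), probs)
             for _ in range(reps)]
    est_s = [ipw_mean(y, strata, stratified_sample(strata, probs, rng), probs)
             for _ in range(reps)]
    return statistics.variance(est_b), statistics.variance(est_s)


if __name__ == "__main__":
    var_b, var_s = compare_variances()
    print(f"Bernoulli sampling variance:  {var_b:.5f}")
    print(f"Stratified sampling variance: {var_s:.5f}")
```

In this toy setup the stratum means differ from the overall mean, so the within-strata second moments of the centered quantity $y - \mu$ exceed the within-strata variances, and the Bernoulli design should show the larger phase two variance. The size of the gap depends entirely on the assumed strata and probabilities.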