Regression Analysis with Individual-Specific Patterns of Missing Covariates

Huazhen Lin, Wei Liu, Wei Lan
DOI: https://doi.org/10.1080/07350015.2019.1635486
2019-01-01
Journal of Business and Economic Statistics
Abstract: It is increasingly common to collect data from heterogeneous sources in practice. Two major challenges complicate the statistical analysis of such data. First, only a small proportion of units have complete information across all sources. Second, the missing data patterns vary across individuals. Our motivating online-loan data have 93% missing covariates, and the missing pattern is individual-specific. Existing regression methods for missing covariates are either inefficient or require additional modeling assumptions on the covariates. We propose a simple yet efficient iterative least squares estimator of the regression coefficient for data with individual-specific missing patterns. Our method has several desirable features. First, it does not require any modeling assumptions on the covariates. Second, the imputation of the missing covariates involves only feasible one-dimensional nonparametric regressions, and can maximally use the information across units and the relationship among the covariates. Third, the iterative least squares estimate is both computationally and statistically efficient. We study the asymptotic properties of our estimator and apply it to the motivating online-loan data. Supplementary materials for this article are available online.
Keywords: High missing rate; Individual-specific missing; Iterative least squares; Missing covariates.