Network Imputation for a Spatial Autoregression Model with Incomplete Data

Zhimeng Sun, Hansheng Wang
DOI: https://doi.org/10.5705/ss.202017.0366
IF: 1.4
2020-01-01
Statistica Sinica
Abstract: Numerous imputation methods have been developed for missing data. However, these methods apply mainly to independent data, and the independence assumption disregards connections between units through social relationships (e.g., friendship or follower relationships). In fact, the observed responses of connected friends should provide valuable information about missing responses. This motivates us to impute missing responses by borrowing information from connected friends through the network structure. Under an assumption on the missingness mechanism and using observed information only, we propose a partial likelihood approach and develop the corresponding maximum partial likelihood estimator (MPLE). The estimator's consistency and asymptotic normality are established. Using the MPLE, we then develop a novel regression imputation method. The method utilizes both auxiliary information and connected complete units (i.e., network information); using the imputed data, we can compute the sample mean of the responses. We show that the resulting estimator is consistent and asymptotically normal. Compared with the imputation method that uses auxiliary information only (i.e., ignores network information), the proposed estimator is statistically more efficient. Extensive simulation studies are conducted to demonstrate the finite-sample performance of the proposed method. We then analyze a real data example from QQ in mainland China.