A GMM Approach in Coupling Internal Data and External Summary Information with Heterogeneous Data Populations

Jun Shao, Jinyi Wang, Lei Wang
DOI: https://doi.org/10.1007/s11425-022-2111-0
2024-01-01
Abstract: Because of advances in data collection and storage, statistical analysis in modern scientific research and practice can now utilize external information such as summary statistics from similar studies. A likelihood approach based on a parametric model assumption has been developed in the literature to utilize external summary information when the populations for the external data and the main internal data are assumed to be the same. In this article, we instead consider the generalized estimation equation (GEE) approach for statistical inference, which is semiparametric or nonparametric, and show how to utilize external summary information even when the internal and external data populations are not the same. Our approach couples the internal data and external summary information to form additional estimation equations and then applies the generalized method of moments (GMM). We show that the proposed GMM estimator is asymptotically normal and, under some conditions, is more efficient than the GEE estimator that does not use external summary information. Estimators of the asymptotic covariance matrix of the GMM estimators are also proposed. Simulation results are obtained to confirm our theory and quantify the improvements gained by utilizing external data. An example is also included for illustration.
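To make the idea of "additional estimation equations" concrete, the following is a minimal sketch and not the authors' estimator. It assumes a simple linear model E[Y | X] = a + bX for the internal data, a single external summary statistic `mu_ext` (an external mean of Y) treated as known without error, and, unlike the heterogeneous setting of the paper, identical internal and external populations. The function name `gmm_with_external_mean` and the simulated data are hypothetical and used only for illustration.

```python
import numpy as np

def gmm_with_external_mean(x, y, mu_ext):
    """Two-step GMM for theta = (a, b) in E[Y|X] = a + b*X.

    Moment functions (all linear in theta):
      g1 = y - a - b*x            (internal estimating equation)
      g2 = x * (y - a - b*x)      (internal estimating equation)
      g3 = a + b*x - mu_ext       (extra equation from the external mean;
                                   assumes the same population, mu_ext exact)
    Each g_i is written as c_i - A_i @ theta.
    """
    n = len(y)
    ones = np.ones(n)
    c = np.column_stack([y, x * y, np.full(n, -mu_ext)])      # shape (n, 3)
    A = np.stack([
        np.column_stack([ones, x]),
        np.column_stack([x, x**2]),
        np.column_stack([-ones, -x]),
    ], axis=1)                                                # shape (n, 3, 2)
    c_bar, A_bar = c.mean(axis=0), A.mean(axis=0)

    def solve(W):
        # Minimizer of (c_bar - A_bar theta)' W (c_bar - A_bar theta)
        return np.linalg.solve(A_bar.T @ W @ A_bar, A_bar.T @ W @ c_bar)

    theta_1 = solve(np.eye(3))                 # step 1: identity weight
    g = c - A @ theta_1                        # moment values, shape (n, 3)
    W = np.linalg.inv(np.cov(g, rowvar=False)) # step 2: inverse moment covariance
    return solve(W)

# Simulated internal data (hypothetical): a = 1, b = 2, E[X] = 1, so E[Y] = 3.
rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=500)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, size=500)
theta_hat = gmm_with_external_mean(x, y, mu_ext=3.0)
print(theta_hat)
```

With three moment conditions and two parameters the system is over-identified, which is where the GMM weighting comes in. Handling differing populations and the sampling variability of the external summary, as the abstract describes, would require further equations and assumptions that this sketch does not attempt.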