Optimal subsampling algorithm for the marginal model with large longitudinal data

Haohui Han, Liya Fu
DOI: https://doi.org/10.48550/arXiv.2311.08812
2023-11-15
Abstract: Big data is ubiquitous in practice, but it also imposes a heavy computational burden. To reduce the computational cost while ensuring the effectiveness of the parameter estimators, an optimal subsampling method is proposed for estimating the parameters of marginal models with massive longitudinal data. The optimal subsampling probabilities are derived, and the consistency and asymptotic normality of the resulting estimator are established. Extensive simulation studies evaluate the performance of the proposed method for continuous, binary, and count data under four different working correlation matrices. A depression dataset is used to illustrate the proposed method.
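The general idea of subsampling for a marginal model can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not state the derived optimal probabilities, so the sketch assumes a linear marginal model, an independence working correlation, balanced clusters, and sampling probabilities proportional to the norm of each subject's score contribution (an illustrative choice). The names `weighted_fit` and `two_step_subsample` are hypothetical.

```python
import numpy as np

def weighted_fit(X, y, w):
    """Solve sum_i w_i X_i^T (y_i - X_i beta) = 0 (independence working correlation).

    X: (n, m, p) covariates, y: (n, m) responses, w: (n,) subject-level weights.
    """
    A = np.einsum("i,imp,imq->pq", w, X, X)
    b = np.einsum("i,imp,im->p", w, X, y)
    return np.linalg.solve(A, b)

def two_step_subsample(X, y, r0, r, rng):
    """Pilot estimate from a uniform subsample, then a nonuniform subsample."""
    n = X.shape[0]
    # Step 1: uniform pilot subsample of r0 subjects (with replacement).
    pilot = rng.choice(n, size=r0, replace=True)
    beta0 = weighted_fit(X[pilot], y[pilot], np.ones(r0))
    # Step 2: ASSUMED probabilities, proportional to the norm of each
    # subject's score X_i^T (y_i - X_i beta0); not the paper's formula.
    resid = y - X @ beta0
    scores = np.einsum("imp,im->ip", X, resid)
    norms = np.linalg.norm(scores, axis=1)
    probs = norms / norms.sum()
    idx = rng.choice(n, size=r, replace=True, p=probs)
    # Inverse-probability weights correct for the nonuniform sampling.
    beta = weighted_fit(X[idx], y[idx], 1.0 / probs[idx])
    return beta, probs

# Synthetic longitudinal data: n subjects, m repeated measurements each,
# with a shared random intercept inducing within-subject correlation.
rng = np.random.default_rng(0)
n, m, p = 5000, 4, 3
beta_true = np.array([1.0, -0.5, 0.25])
X = rng.normal(size=(n, m, p))
y = X @ beta_true + 0.5 * rng.normal(size=(n, 1)) + rng.normal(size=(n, m))

beta_hat, probs = two_step_subsample(X, y, r0=200, r=1000, rng=rng)
print("subsample estimate:", beta_hat)
```

Sampling is done at the subject level so that each subject's repeated measurements stay together, which is what distinguishes the longitudinal setting from subsampling independent observations.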
Methodology