A fast divide-and-conquer strategy for single-index model with massive data

Na Li, Jing Yang
DOI: https://doi.org/10.1007/s00180-024-01562-6
IF: 1.4049
2024-09-30
Computational Statistics
Abstract: With the rapid development of modern technology, massive data has received widespread attention. Constrained by computer performance, traditional statistical analysis methods struggle to produce quantitative results for massive data. Currently, the most popular analytical method for massive data is the divide-and-conquer (DC) strategy, but relevant studies rarely address the fitting of semiparametric models to massive data. In this paper, we combine the ideas of DC and the refined outer product gradient (rOPG) to propose the DC-lsrOPG and DC-qrOPG methods for analyzing massive data with single-index models, based on least squares regression and quantile regression, respectively. The newly developed methods significantly reduce both the main memory required and the running time. The asymptotic normality of the proposed estimators is established under some mild conditions. The resulting estimators are theoretically as efficient as the traditional rOPG estimators computed on the entire data. Simulation studies and a real data analysis are conducted to illustrate the finite-sample performance of the proposed methods.
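The abstract does not spell out the algorithm, so the following is only a rough illustration of the divide-and-conquer idea for a single-index model y = g(xᵀβ) + ε. It is a sketch under several assumptions that are not taken from the paper: it uses the plain (unrefined) OPG estimator with least squares, a Gaussian kernel with a fixed bandwidth, and it combines blocks by averaging their outer-product matrices. The function names `opg_matrix` and `dc_opg_direction` are hypothetical, and this should not be read as the DC-lsrOPG procedure itself.

```python
import numpy as np

def opg_matrix(X, y, bandwidth):
    """Average outer product of local-linear gradient estimates on one block.

    Assumption: plain OPG with a Gaussian kernel, not the refined rOPG.
    """
    n, p = X.shape
    M = np.zeros((p, p))
    for j in range(n):
        D = X - X[j]                                   # covariates centered at x_j
        w = np.exp(-0.5 * np.sum(D**2, axis=1) / bandwidth**2)  # kernel weights
        Z = np.hstack([np.ones((n, 1)), D])            # local-linear design matrix
        A = Z.T @ (Z * w[:, None]) + 1e-8 * np.eye(p + 1)  # small ridge for stability
        coef = np.linalg.solve(A, Z.T @ (w * y))       # weighted least squares
        grad = coef[1:]                                # slope = gradient estimate at x_j
        M += np.outer(grad, grad)
    return M / n

def dc_opg_direction(X, y, n_blocks, bandwidth):
    """Divide the data into blocks, compute an OPG matrix per block,
    average the matrices, and return the leading eigenvector.

    Assumption: blocks are combined by simple averaging.
    """
    blocks = np.array_split(np.arange(len(y)), n_blocks)
    M_bar = sum(opg_matrix(X[idx], y[idx], bandwidth) for idx in blocks) / n_blocks
    _, eigvecs = np.linalg.eigh(M_bar)                 # eigenvalues in ascending order
    beta = eigvecs[:, -1]                              # leading eigenvector, unit norm
    # Fix the sign so the largest-magnitude component is positive.
    return beta if beta[np.argmax(np.abs(beta))] > 0 else -beta
```

Only one block needs to be held in memory at a time, and each block contributes a small p × p matrix, which is the memory and running-time saving that a DC strategy aims for.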
statistics & probability
What problem does this paper attempt to address?