Distributed quantile regression for massive heterogeneous data

Aijun Hu, Yuling Jiao, Yanyan Liu, Yueyong Shi, Yuanshan Wu
DOI: https://doi.org/10.1016/j.neucom.2021.03.041
IF: 6
2021-01-01
Neurocomputing
Abstract: Massive data sets pose great challenges to data analysis because of their heterogeneous data structure and limited computer memory. Jordan et al. (2019, Journal of the American Statistical Association) proposed a communication-efficient surrogate likelihood (CSL) method to solve distributed learning problems. However, their method cannot be directly applied to quantile regression because the quantile regression loss function does not meet the smoothness requirement of the CSL method. In this paper, we extend the CSL method so that it is applicable to quantile regression problems. The key idea is to construct a surrogate loss function that relates to the local data only through subgradients of the loss function. The alternating direction method of multipliers (ADMM) algorithm is used to address computational issues caused by the non-smooth loss function. Our theoretical analysis establishes the consistency and asymptotic normality of the proposed method. Simulation studies and applications to real data show that our method works well. (c) 2021 Elsevier B.V. All rights reserved.
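The surrogate-loss idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names are hypothetical, and the surrogate below assumes the CSL form (local loss minus a linear correction) with subgradients of the check loss substituted for gradients. The ADMM step used to minimize the surrogate is left out.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}), elementwise."""
    return u * (tau - (u < 0).astype(float))

def local_loss(beta, X, y, tau):
    """Average check loss of the residuals on one machine's data."""
    return np.mean(check_loss(y - X @ beta, tau))

def local_subgradient(beta, X, y, tau):
    """One subgradient of local_loss with respect to beta."""
    r = y - X @ beta
    return -X.T @ (tau - (r < 0).astype(float)) / len(y)

def averaged_subgradient(beta, shards, tau):
    """Sample-size-weighted average of the subgradients from all machines.

    `shards` is a list of (X, y) pairs, one per machine (hypothetical layout).
    Only these p-dimensional vectors need to be communicated.
    """
    n_total = sum(len(y) for _, y in shards)
    return sum(len(y) * local_subgradient(beta, X, y, tau)
               for X, y in shards) / n_total

def surrogate_loss(beta, X1, y1, tau, g_local, g_global):
    """Assumed CSL-style surrogate evaluated on the first machine.

    g_local and g_global are the local and averaged subgradients at an
    initial estimate; other machines enter only through g_global.
    """
    return local_loss(beta, X1, y1, tau) - (g_local - g_global) @ beta
```

In this sketch, each machine would compute `local_subgradient` at a common initial estimate, the results would be combined with `averaged_subgradient`, and the first machine would then minimize `surrogate_loss` over `beta` using its own data only.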