Communication-efficient Estimation of Quantile Matrix Regression for Massive Datasets

Yaohong Yang, Lei Wang, Jiamin Liu, Rui Li, Heng Lian
DOI: https://doi.org/10.1016/j.csda.2023.107812
IF: 2.035
2023-01-01
Computational Statistics & Data Analysis
Abstract: In modern scientific applications, more and more data sets contain natural matrix predictors, and traditional regression methods are not directly applicable. Matrix regression has been adapted to such data structures and has received increasing attention in recent years. In this paper, we consider estimation of the conditional quantile in high-dimensional regularized matrix regression with a nuclear norm penalty and establish the convergence rate of the estimator. In order to construct a quantile matrix regression estimator in the distributed setting or for massive data sets, we propose a regularized communication-efficient surrogate loss (CSL) function. The proposed CSL method requires the worker machines only to compute the gradient based on their local data, while the central machine solves a regularized estimation problem. We prove that the estimation error of the proposed CSL method matches the estimation error bound of the centralized method that analyzes the entire data set. An alternating direction method of multipliers algorithm is developed to efficiently obtain the distributed CSL estimator. The finite-sample performance of the proposed estimator is studied through simulations and an application to the Beijing Air Quality data set.
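The division of labor described in the abstract (workers compute local gradients, the central machine solves a penalized problem) can be illustrated with a minimal NumPy sketch. It assumes the generic CSL form, local loss minus a linear gradient-correction term plus the nuclear norm penalty, and a plain subgradient of the check loss; the paper's exact surrogate, any smoothing it applies, and its ADMM updates may differ. All function names here are hypothetical and chosen only for illustration.

```python
import numpy as np

def check_loss(u, tau):
    """Average quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return np.mean(u * (tau - (u < 0)))

def local_subgradient(X, y, B, tau):
    """Subgradient of the average check loss w.r.t. B on one machine.

    X has shape (n, p, q), y has shape (n,), B has shape (p, q).
    The model assumed here is y_i = <X_i, B> + error.
    """
    resid = y - np.einsum("ipq,pq->i", X, B)
    weights = (resid < 0).astype(float) - tau
    return np.einsum("i,ipq->pq", weights, X) / len(y)

def aggregate_gradients(grads, sizes):
    """Sample-size-weighted average of the workers' local gradients."""
    total = float(sum(sizes))
    return sum(n * g for n, g in zip(sizes, grads)) / total

def csl_surrogate(X1, y1, B, B_init, global_grad, tau, lam):
    """Assumed generic CSL objective evaluated on the central machine.

    local loss - <local grad at B_init - global grad at B_init, B>
    + lam * nuclear norm of B.
    """
    resid = y1 - np.einsum("ipq,pq->i", X1, B)
    shift = local_subgradient(X1, y1, B_init, tau) - global_grad
    nuclear = np.linalg.svd(B, compute_uv=False).sum()
    return check_loss(resid, tau) - np.sum(shift * B) + lam * nuclear

def svt(M, thresh):
    """Singular value soft-thresholding: the proximal operator of
    thresh * nuclear norm, the usual building block in ADMM schemes
    for nuclear-norm-penalized problems."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt
```

In this sketch only the `(p, q)` gradient matrices travel between machines, once per round, which is what makes the scheme communication-efficient; the raw observations never leave the workers.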