Distributed Logistic Regression for Separated Massive Data.

Peishen Shi, Puyu Wang, Hai Zhang
DOI: https://doi.org/10.1007/978-981-15-1899-7_20
2019-01-01
Abstract: In this paper, we study distributed logistic regression for processing separated large-scale data stored across different linked computers. Based on the Alternating Direction Method of Multipliers (ADMM), we transform the logistic regression problem into a multistep iterative process and propose a distributed logistic algorithm with controllable communication cost. Specifically, in each iteration of the distributed algorithm, each computer updates its local estimators and simultaneously exchanges them with its neighbors. We then prove the convergence of the distributed logistic algorithm. Owing to the decentralized nature of the computer network, the proposed algorithm is robust. The classification results of our distributed logistic method are the same as those of the non-distributed approach. Numerical studies show that our approach is both effective and efficient, and performs well in distributed massive data analysis.
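The iteration described in the abstract (each computer updates a local estimator, then exchanges it with its neighbors) can be illustrated with a generic decentralized consensus ADMM. The sketch below is not the authors' algorithm, whose exact updates are not given in the abstract. It assumes the commonly used decentralized ADMM update form, a ring network of four nodes, synthetic data, a small ridge term `lam` to keep the problem well-posed, and a penalty parameter `c`. All function and variable names are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def grad_hess(A, y, x, lam):
    """Gradient and Hessian of a ridge-regularised logistic loss (labels in {0, 1})."""
    p = sigmoid(A @ x)
    g = A.T @ (p - y) + lam * x
    H = (A * (p * (1 - p))[:, None]).T @ A + lam * np.eye(x.size)
    return g, H

def newton(grad_hess_fn, x0, iters=50, tol=1e-10):
    """Plain Newton iteration; adequate for the small, strongly convex problems here."""
    x = x0.copy()
    for _ in range(iters):
        g, H = grad_hess_fn(x)
        step = np.linalg.solve(H, g)
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

def decentralized_admm(data, neighbors, c=2.0, lam=0.1, iters=300):
    """Assumed update form, for node i with neighbor set N_i:
       x_i <- argmin f_i(x) + alpha_i.x + c * sum_{j in N_i} ||x - (x_i + x_j)/2||^2
       alpha_i <- alpha_i + c * sum_{j in N_i} (x_i - x_j)
    """
    n, d = len(data), data[0][0].shape[1]
    x = np.zeros((n, d))       # local estimators, one row per node
    alpha = np.zeros((n, d))   # local dual variables
    for _ in range(iters):
        x_new = np.empty_like(x)
        for i, (A, y) in enumerate(data):
            deg = len(neighbors[i])
            # Only neighbor estimates from the previous round are needed.
            mid_sum = np.sum([(x[i] + x[j]) / 2 for j in neighbors[i]], axis=0)

            def subproblem(v):
                g, H = grad_hess(A, y, v, lam)
                g = g + alpha[i] + 2 * c * (deg * v - mid_sum)
                H = H + 2 * c * deg * np.eye(d)
                return g, H

            x_new[i] = newton(subproblem, x[i])  # warm start at current estimate
        x = x_new
        for i in range(n):
            alpha[i] += c * np.sum([x[i] - x[j] for j in neighbors[i]], axis=0)
    return x

# Synthetic data split across 4 nodes on a ring (all values are illustrative).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
data = []
for _ in range(4):
    A = rng.normal(size=(50, 3))
    y = (rng.random(50) < sigmoid(A @ w_true)).astype(float)
    data.append((A, y))
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}

lam = 0.1
x_local = decentralized_admm(data, neighbors, lam=lam)

# Non-distributed reference: pool all data; the ridge terms of the 4 nodes add up.
A_all = np.vstack([A for A, _ in data])
y_all = np.concatenate([y for _, y in data])
x_central = newton(lambda v: grad_hess(A_all, y_all, v, len(data) * lam), np.zeros(3))

print("max deviation from pooled fit:", np.max(np.abs(x_local - x_central)))
```

Each node touches only its own data and the latest estimates of its neighbors, so the per-iteration communication is one vector per network edge in each direction. In this toy setting the local estimators should approach the pooled (non-distributed) fit, which mirrors the equivalence claimed in the abstract.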