Communication Efficient Distributed Learning with Feature Partitioned Data

Bingwen Zhang, Jun Geng, Weiyu Xu, Lifeng Lai
DOI: https://doi.org/10.1109/ciss.2018.8362294
2018-01-01
Abstract: One major bottleneck in the design of large-scale distributed machine learning algorithms is the communication cost. In this paper, we propose and analyze a distributed learning scheme for reducing the amount of communication in distributed learning problems under the feature partition scenario. The motivating observation is that existing schemes for the feature partition scenario require a large amount of data exchange to calculate gradients. In our proposed scheme, instead of calculating the exact gradient at each iteration, we calculate it only sporadically. We provide precise conditions for determining when to perform the exact update, and we characterize the convergence rate along with bounds on the total number of iterations and the number of communication iterations. We further test our algorithm on real data sets and show that the proposed scheme can substantially reduce the amount of data transferred between distributed nodes.