Toward Quality of Information Aware Distributed Machine Learning

Houping Xiao, Shiyu Wang
DOI: https://doi.org/10.1145/3522591
IF: 4.157
2022-03-15
ACM Transactions on Knowledge Discovery from Data
Abstract: In the era of big data, data are usually distributed across numerous connected computing and storage units (i.e., nodes or workers). In such an environment, many machine learning problems can be reformulated as a consensus optimization problem, which consists of an objective and constraint terms that split into N parts (each corresponding to a node). Such a problem can be solved efficiently in a distributed manner via the Alternating Direction Method of Multipliers (ADMM). However, existing consensus optimization frameworks assume that every node has the same quality of information (QoI), i.e., that the data from all the nodes are equally informative for the estimation of global model parameters. As a consequence, they may produce inaccurate estimates in the presence of nodes with low QoI. To overcome this challenge, we propose a novel consensus optimization framework for distributed machine learning that incorporates this crucial metric, quality of information. Theoretically, we prove that the convergence rate of the proposed framework is linear in the number of iterations but has a tighter upper bound than that of ADMM. Experimentally, we show that the proposed framework is more efficient and effective than existing ADMM-based solutions on both synthetic and real-world datasets, owing to its faster convergence rate and higher accuracy.
computer science, information systems, software engineering
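
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard global-consensus ADMM (scaled form) on a toy problem: each node holds a list of scalar observations and all nodes must agree on one estimate. This is not the QoI-aware framework proposed in the paper, whose update rules the abstract does not give. The function name consensus_admm_mean, the penalty parameter rho, the iteration count, and the toy data are all illustrative assumptions.

```python
def consensus_admm_mean(node_data, rho=1.0, iters=200):
    """Standard consensus ADMM (not the paper's QoI-aware variant).

    Solves  min  sum_i 0.5 * sum_j (x_i - a_ij)^2   s.t.  x_i = z for all i,
    where a_ij is the j-th observation held by node i. The solution is the
    mean of all observations pooled across nodes.
    """
    n_nodes = len(node_data)
    x = [0.0] * n_nodes   # local estimates, one per node
    u = [0.0] * n_nodes   # scaled dual variables, one per node
    z = 0.0               # global consensus variable

    for _ in range(iters):
        # Local step: each node minimizes its own loss plus the penalty
        # (rho / 2) * (x_i - z + u_i)^2, which has a closed form here.
        x = [
            (sum(a) + rho * (z - u_i)) / (len(a) + rho)
            for a, u_i in zip(node_data, u)
        ]
        # Global step: average the nodes' messages with equal weight.
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / n_nodes
        # Dual step: accumulate each node's disagreement with the consensus.
        u = [u_i + x_i - z for x_i, u_i in zip(x, u)]
    return z


if __name__ == "__main__":
    # Hypothetical data: three nodes with different amounts of data.
    data = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
    print(consensus_admm_mean(data))
```

In this baseline every observation counts equally toward the final estimate, no matter which node holds it, so a node with unreliable data (such as the single outlying value in the toy input) shifts the consensus as much as a reliable one would. That is the equal-QoI assumption the abstract identifies as a limitation.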