Abstract:The quick evolution and widespread applicability of machine learning and artificial intelligence have fundamentally shaped and transcended modern life. Three key players stand behind such a ubiquitous emergence: big data, growing computing power, and improved algorithms. The need for distributed storage and processing arises from this ``data deluge'' that can flood any powerful machine, from the dispersively available data that are prohibitively costly to transfer to a central unit for further processing, and from the prevalent Internet-of-Things (IoT) devices that require real-time response as well as respect of privacy. Modern machine learning algorithms built to exploit such huge amounts of data are often computationally ``hungry'' and their ``appetite'' for computing power increases rapidly at a pace unmatched by the development of computing hardware. All these considerations justify the pressing need for distributed optimization algorithms that are scalable yet flexible to adapt to various configurations of networked computing nodes. To cope with these challenges, the present thesis first introduces a novel ADMM based approach (termed hybrid ADMM) for efficient decentralized optimization. By modeling the underlying communication patterns as hypergraphs, it provides a unifying framework that subsumes both centralized and fully decentralized counterparts as special cases, and allows nodes to communicate in centralized and decentralized ways at the same time. Leveraging the expressiveness of hypergraph models, a technique termed ``in-network acceleration'' is introduced enabling ``almost free'' performance gain by exploiting local graph topology. To account for heterogeneity of nodes and edges, a diagonal scaling based approach is proposed to tackle weighted updates, where proper edge weights are identified through solving a preconditioning problem. By assigning larger weights to critical edges, the proposed algorithm achieves higher efficiency and becomes more robust to perturbations. Finally, to boost the efficiency of the whole system to its full potential, an asynchronous method is introduced to mitigate the straggler problem so that nodes with different processing power can run at full speed. Convergence analysis for proposed algorithms is provided which reveals the connection between convergence rate and spectral properties of the communication graph. Numerical tests of several common tasks on different graphs are carried out to demonstrate the effectiveness of proposed algorithms.

Toward Quality of Information Aware Distributed Machine Learning

Coded Stochastic ADMM for Decentralized Consensus Optimization With Edge Computing

A Block-wise, Asynchronous and Distributed ADMM Algorithm for General Form Consensus Optimization.

ADMM-Tracking Gradient for Distributed Optimization over Asynchronous and Unreliable Networks

Asynchronous Distributed Optimization via ADMM with Efficient Communication

Efficient Methods for Decentralized Optimization over Graphs

Distributed Algorithms for Composite Optimization: Unified Framework and Convergence Analysis

Distributed ADMM for Time-Varying Communication Networks

Decentralized Optimization with Distributed Features and Non-Smooth Objective Functions

Adaptive Uncertainty-Weighted ADMM for Distributed Optimization

A General Distributed Dual Coordinate Optimization Framework for Regularized Loss Minimization

Graph-aware Weighted Hybrid ADMM for Fast Decentralized Optimization

Sublinear and Linear Convergence of Modified ADMM for Distributed Nonconvex Optimization

A Generalized Alternating Direction Implicit Method for Consensus Optimization: Application to Distributed Sparse Logistic Regression

Distributed adaptive Newton methods with global superlinear convergence

Gradient-tracking based Distributed Optimization with Guaranteed Optimality under Noisy Information Sharing

Proximal ADMM for Nonconvex and Nonsmooth Optimization

Distributed Adaptive Newton Methods with Globally Superlinear Convergence

Achieving Globally Superlinear Convergence for Distributed Optimization with Adaptive Newton Method

Distributed Linearized Alternating Direction Method of Multipliers for Composite Convex Consensus Optimization

Limited Communications Distributed Optimization via Deep Unfolded Distributed ADMM