Abstract: Data silos, mainly caused by privacy and interoperability concerns, significantly constrain collaboration among organizations that hold similar data for the same purpose. Distributed learning based on divide-and-conquer provides a promising way to address data silos, but it faces several challenges, including autonomy, privacy guarantees, and the necessity of collaboration. This paper develops an adaptive distributed kernel ridge regression (AdaDKRR) that accounts for autonomy in parameter selection, privacy in communicating only non-sensitive information, and the necessity of collaboration for performance improvement. We provide both theoretical verification and comprehensive experiments for AdaDKRR to demonstrate its feasibility and effectiveness. Theoretically, we prove that under some mild conditions, AdaDKRR performs similarly to running the optimal learning algorithms on the whole data, which verifies the necessity of collaboration and shows that no other distributed learning scheme can essentially beat AdaDKRR under the same conditions. Numerically, we test AdaDKRR on both toy simulations and two real-world applications to show that AdaDKRR is superior to other existing distributed learning schemes. All these results show that AdaDKRR is a feasible scheme to overcome data silos, which is highly desirable in numerous application areas such as intelligent decision-making, price forecasting, and performance prediction for products.
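For orientation, the divide-and-conquer baseline that the abstract builds on can be sketched as follows. This is a minimal illustration of plain distributed kernel ridge regression (each agent fits KRR on its own data and only predictions are averaged), not the adaptive parameter selection of AdaDKRR. The Gaussian kernel, its width, the shared regularization parameter `lam`, the four-agent split, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def gaussian_kernel(a, b, width=0.5):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * width ** 2))


def fit_local_krr(x, y, lam):
    """Solve (K + lam * n * I) alpha = y on one agent's private data."""
    n = len(x)
    k = gaussian_kernel(x, x)
    return np.linalg.solve(k + lam * n * np.eye(n), y)


def dkrr_predict(blocks, x_new, lam):
    """Average local KRR predictions, weighted by local sample size."""
    total = sum(len(x) for x, _ in blocks)
    pred = np.zeros(len(x_new))
    for x, y in blocks:
        alpha = fit_local_krr(x, y, lam)
        pred += (len(x) / total) * (gaussian_kernel(x_new, x) @ alpha)
    return pred


# Synthetic regression data (assumption: noisy samples of sin(pi * x)).
rng = np.random.default_rng(0)
x_all = rng.uniform(-1.0, 1.0, size=(600, 1))
y_all = np.sin(np.pi * x_all[:, 0]) + 0.1 * rng.standard_normal(600)

# Split the sample into 4 disjoint blocks, one per hypothetical agent.
blocks = [(x_all[i::4], y_all[i::4]) for i in range(4)]

x_test = np.linspace(-0.9, 0.9, 50)[:, None]
y_pred = dkrr_predict(blocks, x_test, lam=1e-3)
mse = float(np.mean((y_pred - np.sin(np.pi * x_test[:, 0])) ** 2))
```

In this sketch every agent uses the same hand-picked `lam`, so it sidesteps the autonomy issue the abstract raises, namely how each agent can choose its own parameters without access to the pooled data.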

Effective Distributed Learning with Random Features: Improved Bounds and Algorithms

Distributed Randomized Sketching Kernel Learning

Distributed Kernel Ridge Regression with Communications

Adaptive Distributed Kernel Ridge Regression: A Feasible Distributed Learning Scheme for Data Silos

Distributed Semi-Supervised Learning with Kernel Ridge Regression

Decentralized Kernel Ridge Regression Based on Data-Dependent Random Feature

Towards Theoretical Understanding of Learning Large-scale Dependent Data Via Random Features

Ridgeless Regression with Random Features

Lepskii Principle for Distributed Kernel Ridge Regression

Distributed Least Square Ranking with Random Features

Optimal Rates for Agnostic Distributed Learning

Towards Sharp Analysis for Distributed Learning with Random Features

Optimal Convergence Rates for Distributed Nyström Approximation

Decentralised Learning with Random Features and Distributed Gradient Descent

Distributed Learning with Indefinite Kernels

Distributed Principal Component Analysis Based on Randomized Low-Rank Approximation

Distributed Learning With Dependent Samples

Low-rank kernel regression with preserved locality for multi-class analysis

Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

Communication-Efficient Nonparametric Quantile Regression via Random Features

Coke: Communication-Censored Kernel Learning Via Random Features