Abstract: Random features (RFs) have been widely used to achieve node consistency in decentralized kernel ridge regression (KRR). Currently, consistency is guaranteed by imposing constraints on the feature coefficients, which requires the random features on different nodes to be identical. However, in many applications, the data on different nodes vary significantly in size or distribution, which calls for adaptive, data-dependent methods that generate different RFs. To tackle this essential difficulty, we propose a new decentralized KRR algorithm that pursues consensus on decision functions, which allows great flexibility and adapts well to the data on each node. Convergence is rigorously established and effectiveness is numerically verified: by capturing the characteristics of the data on each node while maintaining the same communication costs as other methods, we achieve an average regression accuracy improvement of 25.5% across six real-world data sets.

What problem does this paper attempt to address?

### Problems the paper attempts to solve

This paper aims to solve the adaptability and flexibility problems caused by significant differences in data among nodes in Distributed Kernel Ridge Regression (DKRR). Specifically:

1. **Limitations of existing methods**:
   - Current DKRR algorithms usually ensure node consistency by imposing consistency constraints on the feature coefficients on different nodes. This means that the Random Features (RF) on different nodes must be the same.
   - However, in practical applications, the amount and distribution of data on different nodes may vary greatly, which requires the algorithm to be able to adaptively generate different RFs.
2. **The proposed new method**:
   - To overcome the above difficulties, the paper proposes a new Decentralized Kernel Ridge Regression with Data-Dependent Random Features (DeKRR-DDRF) algorithm, which pursues the consistency of the decision function rather than the consistency of feature coefficients.
   - This method allows different nodes to use different RFs, thus improving the flexibility and adaptability of the algorithm.
3. **Main contributions**:
   - **Algorithm innovation**: The DeKRR-DDRF algorithm is proposed, allowing each node to use inconsistent features, so that data-dependent RF techniques can be applied to improve accuracy and efficiency.
   - **Solution method**: A fast and communication-efficient solution method is provided. The analytical solution can be directly obtained at each iteration step, and the convergence of the solution method is proved.
   - **Experimental verification**: The performance of the algorithm on non-independent and identically distributed (non-IID) and unbalanced data sets is verified through multiple numerical experiments. The results show that this algorithm is superior to other methods.
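
The difference between coefficient consensus and decision-function consensus can be illustrated with a small sketch. This is not the paper's DeKRR-DDRF solver: it only fits two independent local ridge regressors on random Fourier features and shows that their coefficient vectors live in different spaces, while their predictions on shared inputs remain comparable. The toy target, sample sizes, feature counts, frequency scale, and regularization value are all assumptions chosen for illustration.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier feature map z(x) = sqrt(2/D) * cos(x W + b)."""
    D = W.shape[1]
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def fit_ridge(Z, y, lam):
    """Closed-form ridge regression in the feature space."""
    D = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(D), Z.T @ y)

rng = np.random.default_rng(0)
d = 1  # input dimension (assumed)

def target(X):
    # Hypothetical ground-truth function shared by both nodes.
    return np.sin(3.0 * X[:, 0])

# Node 1 holds many samples; node 2 holds few samples from a shifted range
# (an unbalanced, non-IID toy setting).
X1 = rng.uniform(-1.0, 1.0, size=(200, d))
X2 = rng.uniform(0.0, 2.0, size=(40, d))
y1 = target(X1) + 0.05 * rng.standard_normal(200)
y2 = target(X2) + 0.05 * rng.standard_normal(40)

# Each node draws its own features, with a different number of them.
W1, b1 = 3.0 * rng.standard_normal((d, 50)), rng.uniform(0.0, 2.0 * np.pi, 50)
W2, b2 = 3.0 * rng.standard_normal((d, 20)), rng.uniform(0.0, 2.0 * np.pi, 20)

theta1 = fit_ridge(rff_map(X1, W1, b1), y1, lam=1e-3)
theta2 = fit_ridge(rff_map(X2, W2, b2), y2, lam=1e-3)

# Coefficients have shapes (50,) and (20,), so they cannot be constrained
# to be equal. Decision functions, however, can be compared pointwise.
X_shared = np.linspace(0.0, 1.0, 25).reshape(-1, 1)
f1 = rff_map(X_shared, W1, b1) @ theta1
f2 = rff_map(X_shared, W2, b2) @ theta2
function_gap = float(np.mean((f1 - f2) ** 2))
print("coefficient shapes:", theta1.shape, theta2.shape)
print("mean squared gap between decision functions:", function_gap)
```

In this toy setup a constraint such as `theta1 == theta2` is not even well-posed, whereas `function_gap` is always defined. A consensus method in the spirit described above would penalize or constrain such a function-level gap between neighboring nodes rather than the coefficients themselves.
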
### Specific problem description

- **Problem background**:
  - Kernel methods are widely used to deal with complex nonlinear problems, transforming data from the sample space to a high-dimensional space through nonlinear mapping.
  - The decentralized learning framework has advantages in large-scale data processing, but it needs to solve the problems of data privacy protection and communication cost.
- **Problems of existing methods**:
  - Existing decentralized KRR algorithms rely on the consistency constraint of feature coefficients, which is unreasonable and difficult to achieve in the case of uneven data distribution.
  - Although the random feature method can protect data privacy, using the same RF on different nodes limits the adaptability of the algorithm.
- **Solution**:
  - The proposed DeKRR-DDRF algorithm allows different nodes to use different RFs by pursuing the consistency of the decision function.
  - By introducing the data-dependent RF method, the regression accuracy can be improved while keeping the communication cost unchanged.

### Experimental results

- **Experimental setup**:
  - Six real-world data sets are used for experiments, including housing price, air quality, energy consumption, Twitter sentiment analysis, Tom's hardware review, and waveform data.
  - The experimental evaluation index is Relative Square Error (RSE).
- **Experimental results**:
  - On non-IID and unbalanced data sets, the performance of the DeKRR-DDRF algorithm is significantly better than that of the existing DKRR and DKLA methods.
  - Especially in cases where the amount of data, data distribution, and noise intensity are different, the DeKRR-DDRF algorithm shows higher flexibility and accuracy.

### Conclusion

This paper solves the problems of data adaptability and flexibility in decentralized kernel ridge regression by proposing the DeKRR-DDRF algorithm, providing an effective solution for large-scale data processing in practical applications.
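
The Relative Square Error (RSE) named in the experimental setup is not defined in this summary. A common definition, assumed here, divides the sum of squared prediction errors by the sum of squared deviations of the targets from their mean, so that a value of 0 means a perfect fit and a value of 1 matches a predictor that always outputs the mean. The function name `relative_square_error` is hypothetical.

```python
import numpy as np

def relative_square_error(y_true, y_pred):
    """RSE under the assumed definition:
    sum((y_pred - y_true)^2) / sum((y_true - mean(y_true))^2)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    numerator = np.sum((y_pred - y_true) ** 2)
    denominator = np.sum((y_true - y_true.mean()) ** 2)
    return float(numerator / denominator)

# A perfect prediction gives 0; predicting the mean everywhere gives 1.
perfect = relative_square_error([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = relative_square_error([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
print(perfect, mean_only)
```

Under this definition lower values are better, and the score is undefined when all targets are identical because the denominator is zero.
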

Decentralized Kernel Ridge Regression Based on Data-Dependent Random Feature
