Abstract: Inspired by the recent work [28] on the statistical robustness of empirical risks in reproducing kernel Hilbert space (RKHS), where the training data are potentially perturbed or even corrupted, we take a step further in this paper and investigate the statistical robustness of the kernel learning estimator (the regularized empirical risk minimizer or stationary point). We begin by deriving qualitative statistical robustness of the regularized empirical risk minimizer for a broad class of convex cost functions when all of the training data are potentially perturbed under some topological structures. We then consider quantitative statistical robustness of the stationary solution in the specific case where the cost function is continuously differentiable but not necessarily convex. In the latter case, we derive the first-order optimality condition of the regularized expected risk minimization problem, which is essentially a stochastic variational inequality problem (SVIP) in RKHS, and then use the SVIP as a platform to investigate local and global Lipschitz continuity of the stationary solution with respect to perturbation of the probability distribution under the Fortet-Mourier metric. A crucial assumption in this analysis is that the perturbed data are independent and identically distributed (iid). In some practical applications, this assumption may fail when a small proportion of perceived data is seriously perturbed or contaminated. In this case, we use the influence function to investigate the impact of a single data perturbation on the expected risk minimizer. Unlike [64, Chapter 10], we concentrate on constrained expected risk minimization problems. The analysis essentially comes down to deriving an implicit function theorem for the SVIP in RKHS. Finally, we illustrate our theoretical analysis with a couple of academic examples.
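
For orientation, the objects named in the abstract can be written out as follows. This is a sketch in assumed notation, not the paper's own: the symbols c, k, \lambda, \mathcal{F}, \mathcal{N}_{\mathcal{F}} and T are introduced here for illustration only, and differentiation is assumed to be interchangeable with expectation.

```latex
% Assumed notation (not taken from the paper):
%   z = (x, y) a data point with distribution P, k the kernel of the RKHS H_k,
%   c(z, f(x)) the cost, c'_2 its derivative in the second argument,
%   \lambda > 0 the regularization parameter, F a closed convex subset of H_k.

% Constrained regularized expected risk minimization:
\[
  \min_{f \in \mathcal{F}} \; \mathbb{E}_P\big[ c(z, f(x)) \big] + \lambda \|f\|_k^2 .
\]

% First-order optimality condition, using the reproducing property
% f(x) = \langle f, k(\cdot, x) \rangle_k; this has the form of an SVIP:
\[
  0 \in \mathbb{E}_P\big[ c'_2(z, f(x))\, k(\cdot, x) \big] + 2\lambda f
        + \mathcal{N}_{\mathcal{F}}(f),
\]
% where \mathcal{N}_{\mathcal{F}}(f) is the normal cone to F at f.

% Standard influence function of a statistical functional T (here: a solution
% of the SVIP above) at P, for contamination by a single point \tilde z:
\[
  \mathrm{IF}(\tilde z; T, P)
    = \lim_{t \downarrow 0}
      \frac{T\big((1 - t) P + t\, \delta_{\tilde z}\big) - T(P)}{t} .
\]
```

In this reading, Lipschitz continuity of the stationary solution means bounding the distance between T(P) and T(Q) by a constant multiple of the Fortet-Mourier distance between P and Q. The precise conditions under which such a bound holds are the subject of the paper and are not reproduced here.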

Robust Statistics meets elicitability: When fair model validation breaks down

Robust Elicitable Functionals

Robustness Auditing for Linear Regression: To Singularity and Beyond

Robust Validation: Confident Predictions Even When Distributions Shift

Trimming Stability Selection increases variable selection robustness

Rigorous (in)validation of ecological models

Elegant robustification of sparse partial least squares by robustness-inducing transformations

Robustness Evaluation of Regression Tasks with Skewed Domain Preferences

On the Robustness of Kernel Goodness-of-Fit Tests

Statistical Robustness of Kernel Learning Estimator with Respect to Data Perturbation

An Automatic Finite-Sample Robustness Metric: When Can Dropping a Little Data Make a Big Difference?

Optimal Robust Estimation under Local and Global Corruptions: Stronger Adversary and Smaller Error

Distributionally Robust Optimization as a Scalable Framework to Characterize Extreme Value Distributions

A Robust Learning Approach for Regression Models Based on Distributionally Robust Optimization

A Robust Learning Algorithm for Regression Models Using Distributionally Robust Optimization under the Wasserstein Metric

Assumption-lean falsification tests of rate double-robustness of double-machine-learning estimators

Statistical Properties of Robust Satisficing

Improving the Validity and Practical Usefulness of AI/ML Evaluations Using an Estimands Framework

Universal Robust Regression via Maximum Mean Discrepancy

On the KL-Divergence-based Robust Satisficing Model

Resistant Inference in Instrumental Variable Models