Abstract: \emph{Optimal Transport} (OT) has emerged as an important computational tool in machine learning and computer vision, providing a geometrical framework for studying probability measures. OT unfortunately suffers from the curse of dimensionality and requires regularization for practical computations. \emph{Entropic regularization} is a popular choice, and it can be 'unbiased', resulting in a \emph{Sinkhorn divergence}. In this work, we study the convergence of estimating the 2-Sinkhorn divergence between \emph{Gaussian processes} (GPs) using their finite-dimensional marginal distributions. We show almost sure convergence of the divergence when the marginals are sampled according to some base measure. Furthermore, we show that using $n$ marginals the estimation error of the divergence scales in a dimension-free way as $\mathcal{O}\left(\epsilon^{-1}n^{-\frac{1}{2}}\right)$, where $\epsilon$ is the magnitude of entropic regularization.
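
For reference, the 'unbiasing' mentioned in the abstract is commonly written as below. This is a sketch under assumed conventions (squared Euclidean cost, regularization by $\epsilon$ times the Kullback-Leibler divergence relative to the product measure). The symbols $\mathrm{OT}_\epsilon$, $S_\epsilon$, $\mu$, $\nu$, and $\pi$ are notation introduced here, and the paper's own normalization may differ:

$$
\mathrm{OT}_\epsilon(\mu,\nu) = \min_{\pi \in \Pi(\mu,\nu)} \int \|x-y\|^2 \, d\pi(x,y) + \epsilon\, \mathrm{KL}(\pi \,\|\, \mu \otimes \nu),
\qquad
S_\epsilon(\mu,\nu) = \mathrm{OT}_\epsilon(\mu,\nu) - \tfrac{1}{2}\left(\mathrm{OT}_\epsilon(\mu,\mu) + \mathrm{OT}_\epsilon(\nu,\nu)\right).
$$

Subtracting the two self-transport terms removes the bias introduced by the regularizer, so that $S_\epsilon(\mu,\mu) = 0$.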

What problem does this paper attempt to address?

This paper addresses the complexity and convergence of estimating the 2-Sinkhorn divergence between Gaussian processes (GPs) from their finite-dimensional marginal distributions. Specifically, it covers the following points:

1. **Problem background**:
   - Optimal Transport (OT) is an important tool in machine learning and computer vision, but it suffers from the curse of dimensionality on high-dimensional data.
   - Entropic regularization is a popular way to make OT computations practical, and it can be 'unbiased' to yield the Sinkhorn divergence.
   - Gaussian processes are infinite-dimensional stochastic processes that are widely used in machine learning, so appropriate metrics or divergences are needed to compare them.
2. **Research objectives**:
   - Study how to estimate the 2-Sinkhorn divergence between two Gaussian processes from finite-dimensional marginal distributions, and analyze the convergence of the estimate.
   - Show that the divergence estimate converges almost surely when the marginals are sampled according to some base measure.
   - Characterize how the estimation error scales with the number of marginals $n$ and the regularization parameter $\epsilon$.
3. **Main contributions**:
   - The paper shows that the marginal complexity (the error rate as a function of the number of marginals) of the entropy-regularized 2-Wasserstein distance and of the 2-Sinkhorn divergence is $O\left(\frac{1}{\sqrt{n}}\left(\frac{1}{\epsilon}+\text{const.}\right)\right)$ and $O\left(\frac{1}{\epsilon\sqrt{n}}\right)$, respectively.
   - It provides concentration inequalities for the empirical estimation error.
   - It illustrates the convergence of the estimates experimentally and observes that increasing $\epsilon$ does not necessarily reduce the relative error, although a larger $\epsilon$ can reduce the relative error as the input dimension grows.
4. **Theoretical and experimental results**:
   - Theoretically, the paper derives bounds on the expected estimation error together with concentration inequalities, indicating that entropic regularization helps when estimating the divergence.
   - The experiments show how the 2-Sinkhorn divergence estimate behaves under different settings, including changes in the number of marginals, the input dimension, and the regularization parameter $\epsilon$.

### Summary

Through rigorous theoretical analysis and experiments, this paper addresses how to use finite-dimensional marginal distributions to estimate the 2-Sinkhorn divergence between Gaussian processes, and it gives detailed insight into the estimation error and its convergence. The result is a practical tool for comparing Gaussian processes, especially on high-dimensional data.
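
The estimation procedure summarized above can be sketched numerically. The following is a minimal illustration, not the paper's implementation, and it rests on several assumptions that are not taken from the source: the GPs are centered with RBF kernels of lengthscales 0.2 and 0.5, the base measure is uniform on $[0,1]$, Gram matrices are scaled by $1/n$ so that they approximate the covariance operators, and the entropic cost between centered Gaussians uses the closed form known from the entropic OT literature (Janati et al., 2020) with the regularizer $\epsilon\,\mathrm{KL}(\pi\,\|\,\mu\otimes\nu)$. All function names are hypothetical.

```python
import numpy as np


def sqrtm_psd(M):
    """Square root of a symmetric PSD matrix via eigendecomposition."""
    M = (M + M.T) / 2.0
    w, V = np.linalg.eigh(M)
    # Clip tiny negative eigenvalues caused by round-off.
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def entropic_ot_gauss(A, B, eps):
    """Entropic 2-Wasserstein cost between N(0, A) and N(0, B).

    Assumed convention: squared Euclidean cost + eps * KL(pi || mu x nu).
    """
    d = A.shape[0]
    s2 = eps / 2.0  # sigma^2 in the closed form, with eps = 2 * sigma^2
    rA = sqrtm_psd(A)
    D = sqrtm_psd(4.0 * rA @ B @ rA + s2**2 * np.eye(d))
    _, logdet = np.linalg.slogdet(D + s2 * np.eye(d))
    return (np.trace(A) + np.trace(B) - np.trace(D)
            + d * s2 * (1.0 - np.log(2.0 * s2)) + s2 * logdet)


def sinkhorn_divergence_gauss(A, B, eps):
    """Debiased divergence: OT(A, B) - (OT(A, A) + OT(B, B)) / 2."""
    return entropic_ot_gauss(A, B, eps) - 0.5 * (
        entropic_ot_gauss(A, A, eps) + entropic_ot_gauss(B, B, eps))


def rbf_gram(x, lengthscale):
    """RBF kernel Gram matrix on 1-D inputs x (illustrative kernel choice)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def estimate_divergence(n, eps, seed=0):
    """Estimate the divergence from n marginals sampled from the base measure."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=n)  # assumed base measure: uniform on [0, 1]
    K0 = rbf_gram(x, 0.2) / n  # 1/n scaling approximates the covariance operator
    K1 = rbf_gram(x, 0.5) / n
    return sinkhorn_divergence_gauss(K0, K1, eps)


# Estimates for a growing number of marginals at fixed regularization.
for n in (25, 100, 400):
    print(n, estimate_divergence(n, eps=0.1))
```

Running the loop with different seeds and values of `eps` gives an informal view of how the spread of the estimate changes with $n$ and $\epsilon$; it is not a substitute for the paper's bounds.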

Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals
