Estimating 2-Sinkhorn Divergence between Gaussian Processes from Finite-Dimensional Marginals

Anton Mallasto
DOI: https://doi.org/10.48550/arXiv.2102.03267
2021-02-06
Abstract: *Optimal Transport* (OT) has emerged as an important computational tool in machine learning and computer vision, providing a geometrical framework for studying probability measures. OT unfortunately suffers from the curse of dimensionality and requires regularization for practical computations, of which the *entropic regularization* is a popular choice, which can be 'unbiased', resulting in a *Sinkhorn divergence*. In this work, we study the convergence of estimating the 2-Sinkhorn divergence between *Gaussian processes* (GPs) using their finite-dimensional marginal distributions. We show almost sure convergence of the divergence when the marginals are sampled according to some base measure. Furthermore, we show that using $n$ marginals the estimation error of the divergence scales in a dimension-free way as $\mathcal{O}\left(\epsilon^{-1}n^{-\frac{1}{2}}\right)$, where $\epsilon$ is the magnitude of entropic regularization.
Machine Learning
### What problem does this paper attempt to solve?

This paper addresses the complexity and convergence of estimating the 2-Sinkhorn divergence between Gaussian processes (GPs) from their finite-dimensional marginal distributions. Specifically, it covers the following points:

1. **Problem background**:
   - Optimal Transport (OT) is an important tool in machine learning and computer vision, but it suffers from the curse of dimensionality on high-dimensional data.
   - Entropic regularization is commonly used to make OT computable in practice; removing the bias it introduces yields the Sinkhorn divergence.
   - Gaussian processes are infinite-dimensional stochastic processes that are widely used in machine learning, so comparing them requires appropriate metrics or divergences.
2. **Research objectives**:
   - Study how to estimate the 2-Sinkhorn divergence between two Gaussian processes from finite-dimensional marginal distributions, and analyze the convergence of the estimate.
   - Show that the divergence estimate converges almost surely when the marginals are sampled according to some base measure.
   - Describe how the estimation error scales with the number of marginals \(n\) and the regularization parameter \(\epsilon\).
3. **Main contributions**:
   - The paper shows that the marginal complexity (how the error rate depends on the number of marginals) of the entropy-regularized 2-Wasserstein distance and of the 2-Sinkhorn divergence is \(O\left(\frac{1}{\sqrt{n}}\left(\frac{1}{\epsilon}+\text{const.}\right)\right)\) and \(O\left(\frac{1}{\epsilon\sqrt{n}}\right)\), respectively.
   - It provides concentration inequalities for the empirical estimation errors.
   - It illustrates the convergence of the estimates experimentally and observes that increasing \(\epsilon\) does not necessarily reduce the relative error, although a larger \(\epsilon\) can reduce the relative error as the input dimension increases.
4. **Theoretical and experimental results**:
   - Theoretically, the paper derives the expected estimation error and concentration inequalities, indicating that entropic regularization helps when estimating the divergence.
   - The experiments show how the 2-Sinkhorn divergence behaves under different settings, including changes to the number of marginals, the input dimension, and the regularization parameter \(\epsilon\).

### Summary

Through rigorous theoretical analysis and experiments, this paper shows how finite-dimensional marginal distributions can be used to estimate the 2-Sinkhorn divergence between Gaussian processes, and it characterizes the estimation error and convergence of that estimate. This gives a practical tool for comparing Gaussian processes, especially in the case of high-dimensional data.
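The estimation procedure summarized above can be sketched numerically. The following is a minimal illustration, not the paper's implementation: it samples \(n\) index points from a base measure, builds the corresponding finite-dimensional Gaussian marginals, and evaluates the 2-Sinkhorn divergence between them. It relies on a closed-form expression for the entropic 2-Wasserstein distance between Gaussian measures that is known from the Gaussian entropic OT literature but is not stated in this summary, so treat it as an assumption. The RBF kernels, the mean functions, the uniform base measure on \([0, 1]\), and the \(1/n\) and \(1/\sqrt{n}\) scalings used to discretize the \(L^2\) inner product are likewise hypothetical choices made for illustration.

```python
import numpy as np


def _psd_sqrt(K):
    """Symmetric square root of a PSD matrix (tiny negative eigenvalues clipped)."""
    w, V = np.linalg.eigh((K + K.T) / 2.0)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T


def _trace_and_logdet(Ki, Kj, eps):
    """Return Tr(M) and log det(I + M/2) for
    M = -I + (I + (16 / eps^2) Ki^{1/2} Kj Ki^{1/2})^{1/2}  (assumed closed form)."""
    R = _psd_sqrt(Ki)
    C = R @ Kj @ R
    lam = np.clip(np.linalg.eigvalsh((C + C.T) / 2.0), 0.0, None)
    x = (16.0 / eps**2) * lam
    mu = x / (1.0 + np.sqrt(1.0 + x))  # equals sqrt(1 + x) - 1, but stable for small x
    return mu.sum(), np.log1p(mu / 2.0).sum()


def sinkhorn2_gaussian(m0, K0, m1, K1, eps):
    """2-Sinkhorn divergence between N(m0, K0) and N(m1, K1), assuming
    OT_eps = |m0 - m1|^2 + Tr K0 + Tr K1 - (eps/2) Tr M + (eps/2) log det(I + M/2)
    and S_eps = OT_eps(0, 1) - (OT_eps(0, 0) + OT_eps(1, 1)) / 2."""
    t00, l00 = _trace_and_logdet(K0, K0, eps)
    t01, l01 = _trace_and_logdet(K0, K1, eps)
    t11, l11 = _trace_and_logdet(K1, K1, eps)
    mean_term = float(np.sum((m0 - m1) ** 2))
    return mean_term + (eps / 4.0) * (t00 - 2.0 * t01 + t11 + 2.0 * l01 - l00 - l11)


def rbf(x, lengthscale):
    """RBF kernel matrix on 1-D inputs (hypothetical kernel choice)."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2.0 * lengthscale**2))


def estimate_from_marginals(n, eps, rng):
    """Estimate the divergence between two example GPs from n sampled index points."""
    x = rng.uniform(0.0, 1.0, size=n)      # base measure: uniform on [0, 1] (assumption)
    K0 = rbf(x, 0.2) / n                   # 1/n scaling approximates the covariance operator
    K1 = rbf(x, 0.5) / n
    m0 = np.zeros(n)
    m1 = np.sin(2.0 * np.pi * x) / np.sqrt(n)  # 1/sqrt(n) scaling approximates the L2 norm
    return sinkhorn2_gaussian(m0, K0, m1, K1, eps)


rng = np.random.default_rng(0)
for n in (10, 50, 200):
    print(n, estimate_from_marginals(n, eps=0.1, rng=rng))
```

Under these assumptions, the printed estimates should fluctuate less as \(n\) grows, in line with the \(\mathcal{O}\left(\epsilon^{-1}n^{-\frac{1}{2}}\right)\) error scaling stated in the abstract. Only the eigenvalues of \(K_i^{1/2} K_j K_i^{1/2}\) are needed, which is why the sketch never forms the matrix \(M\) explicitly.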