Sparsistency for Inverse Optimal Transport

Francisco Andrade, Gabriel Peyre, Clarice Poon
2024-03-09
Abstract: Optimal Transport is a useful metric to compare probability distributions and to compute a pairing given a ground cost. Its entropic regularization variant (eOT) is crucial to have fast algorithms and reflect fuzzy/noisy matchings. This work focuses on Inverse Optimal Transport (iOT), the problem of inferring the ground cost from samples drawn from a coupling that solves an eOT problem. It is a relevant problem that can be used to infer unobserved/missing links, and to obtain meaningful information about the structure of the ground cost yielding the pairing. On one side, iOT benefits from convexity, but on the other side, being ill-posed, it requires regularization to handle the sampling noise. This work presents an in-depth theoretical study of the ℓ1 regularization to model, for instance, Euclidean costs with sparse interactions between features. Specifically, we derive a sufficient condition for the robust recovery of the sparsity of the ground cost that can be seen as a far-reaching generalization of the Lasso's celebrated Irrepresentability Condition. To provide additional insight into this condition, we work out in detail the Gaussian case. We show that as the entropic penalty varies, the iOT problem interpolates between a graphical Lasso and a classical Lasso, thereby establishing a connection between iOT and graph estimation, an important problem in ML.
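As a rough illustration of the forward problem the abstract describes (computing an eOT coupling from a ground cost, then drawing matched samples from it), here is a minimal NumPy sketch. It is not the authors' code: the function name `sinkhorn`, the sparse interaction matrix `A`, the bilinear cost with its sign convention, and the values of `eps`, `n`, and `d` are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(cost, a, b, eps, n_iter=500):
    """Entropic OT coupling between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)                  # rescale to match row marginals
        v = b / (K.T @ u)                # rescale to match column marginals
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
n, d = 30, 3
X = rng.normal(size=(n, d))              # source features
Y = rng.normal(size=(n, d))              # target features

# Hypothetical sparse interaction matrix: only two feature pairs interact.
A = np.zeros((d, d))
A[0, 0] = 1.0
A[1, 2] = -0.5

# Bilinear ground cost c(x_i, y_j) = -x_i^T A y_j (sign convention assumed).
cost = -(X @ A @ Y.T)

a = np.full(n, 1.0 / n)
b = np.full(n, 1.0 / n)
P = sinkhorn(cost, a, b, eps=1.0)

# Draw matched index pairs (i, j) from the coupling: the data iOT starts from.
idx = rng.choice(n * n, size=200, p=P.ravel() / P.sum())
pairs = np.stack(np.unravel_index(idx, P.shape), axis=1)
```

In this setup, iOT would be the reverse direction: given only `pairs` (and the features `X`, `Y`), estimate `A`, with the ℓ1 penalty encouraging the estimate to share the zero pattern of the true matrix.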
Statistics Theory
What problem does this paper attempt to address?
The paper attempts to address the problem of ground cost estimation in inverse optimal transport (iOT). Specifically:

1. **Background and Motivation**:
   - Optimal transport (OT) is an effective method for comparing probability distributions and can compute pairings given a ground cost.
   - The paper focuses on the inverse optimal transport problem (iOT): inferring the ground cost from samples drawn from a coupling that solves an entropy-regularized optimal transport (eOT) problem.
   - This problem is relevant because it can be used to infer unobserved or missing links and to obtain meaningful information about the structure of the ground cost.

2. **Main Challenges**:
   - Although the iOT problem is convex, it is ill-posed, so regularization is required to handle the sampling noise.

3. **Research Contributions**:
   - The paper presents a theoretical analysis of ℓ1 regularization for stably learning the ground cost from noisy, sampled matchings.
   - A sufficient condition is derived that ensures robust recovery of the sparsity pattern of the ground cost under noise; it can be seen as a generalization of the well-known Lasso Irrepresentability Condition.
   - For Gaussian distributions, the specific form of this condition is worked out in detail, showing how the parameters affect the success and stability of iOT.
   - The paper studies the limiting cases of the entropic penalty, finding that iOT approaches a graphical Lasso for small entropic regularization and a classical Lasso for large entropic regularization.

In summary, the paper addresses ground cost estimation in inverse optimal transport through theoretical analysis and numerical experiments, and connects iOT to graph estimation in machine learning.
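For readers unfamiliar with the Lasso condition that the paper generalizes, the following sketch evaluates the textbook Irrepresentability Condition for a design matrix: the quantity max over j outside the support S of |X_j^T X_S (X_S^T X_S)^{-1} sign(β_S)| should be strictly below 1. This is the classical Lasso version only, not the iOT condition derived in the paper; the function name `irrepresentability` and the toy matrices are assumptions made for illustration.

```python
import numpy as np

def irrepresentability(X, support, signs):
    """Classical Lasso irrepresentability quantity for a given support and sign pattern."""
    S = np.asarray(support)
    Sc = np.setdiff1d(np.arange(X.shape[1]), S)   # columns outside the support
    if Sc.size == 0:
        return 0.0                                # nothing to misidentify
    XS = X[:, S]
    # w = (X_S^T X_S)^{-1} sign(beta_S)
    w = np.linalg.solve(XS.T @ XS, np.asarray(signs, dtype=float))
    return float(np.max(np.abs(X[:, Sc].T @ XS @ w)))

# Toy designs (hypothetical): the third column is correlated with the two
# support columns, weakly in X_ok and strongly in X_bad.
X_ok = np.array([[1.0, 0.0, 0.4],
                 [0.0, 1.0, 0.4],
                 [0.0, 0.0, 1.0]])
X_bad = np.array([[1.0, 0.0, 0.8],
                  [0.0, 1.0, 0.8],
                  [0.0, 0.0, 1.0]])

ok_value = irrepresentability(X_ok, support=[0, 1], signs=[1, 1])
bad_value = irrepresentability(X_bad, support=[0, 1], signs=[1, 1])

print(ok_value < 1)    # condition holds: off-support column is weakly correlated
print(bad_value < 1)   # condition fails: off-support column is too correlated
```

The intuition carried over to iOT is the same: sparsity recovery is expected to succeed when the inactive components are not too well explained by the active ones.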