Variational Analysis in the Wasserstein Space

Nicolas Lanzetti, Antonio Terpin, Florian Dörfler
2024-06-15
Abstract: We study optimization problems whereby the optimization variable is a probability measure. Since the probability space is not a vector space, many classical and powerful methods for optimization (e.g., gradients) are of little help. Thus, one typically resorts to the abstract machinery of infinite-dimensional analysis or other ad-hoc methodologies, not tailored to the probability space, which however involve projections or rely on convexity-type assumptions. We believe instead that these problems call for a comprehensive methodological framework for calculus in probability spaces. In this work, we combine ideas from optimal transport, variational analysis, and Wasserstein gradient flows to equip the Wasserstein space (i.e., the space of probability measures endowed with the Wasserstein distance) with a variational structure, both by combining and extending existing results and introducing novel tools. Our theoretical analysis culminates in very general necessary conditions for optimality. Notably, our conditions (i) resemble the rationales of Euclidean spaces, such as the Karush-Kuhn-Tucker and Lagrange conditions, (ii) are intuitive, informative, and easy to study, and (iii) yield closed-form solutions or can be used to design computationally attractive algorithms. We believe this framework lays the foundation for new algorithmic and theoretical advancements in the study of optimization problems in probability spaces, which we exemplify with numerous case studies and applications to machine learning, drug discovery, and distributionally robust optimization.
Optimization and Control
What problem does this paper attempt to address?
The paper addresses the lack of a comprehensive methodological framework for optimization problems in which the decision variable is a probability measure. Because the space of probability measures is not a vector space, many classical optimization tools (such as gradients) cannot be applied directly. Existing approaches therefore rely either on the abstract machinery of infinite-dimensional analysis or on ad-hoc methods that are not tailored to probability spaces, and these typically involve projections or depend on convexity-type assumptions.
To close this gap, the authors combine optimal transport, variational analysis, and Wasserstein gradient flows to equip the Wasserstein space (i.e., the space of probability measures endowed with the Wasserstein distance) with a variational structure. They do so both by combining and extending existing results and by introducing new tools, including Wasserstein-space counterparts of objects from variational analysis such as generalized subgradients, normal cones, and tangent cones. The analysis culminates in general necessary conditions for optimality that resemble the Karush-Kuhn-Tucker and Lagrange conditions of Euclidean spaces, are intuitive and easy to study, and either yield closed-form solutions or can be used to design computationally attractive algorithms. The authors illustrate the framework with case studies and applications in machine learning, drug discovery, and distributionally robust optimization.
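As a concrete illustration of the Wasserstein distance mentioned above, the following minimal sketch computes the 2-Wasserstein distance between two empirical measures on the real line. It is not taken from the paper: the function name wasserstein2_1d and the restriction to equally many, equally weighted atoms are assumptions made for this example. It relies on the standard fact that, in one dimension, an optimal coupling matches atoms in sorted order.

```python
import math


def wasserstein2_1d(xs, ys):
    """2-Wasserstein distance between two empirical measures on the real line.

    Assumption of this sketch: both measures have the same number of atoms,
    each carrying equal mass 1/n.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("this sketch assumes equally many (and at least one) atoms")
    # In one dimension, matching the i-th smallest atom of one measure to the
    # i-th smallest atom of the other is an optimal coupling.
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    mean_sq_cost = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq_cost)


mu = [0.0, 1.0, 2.0]
nu = [3.0, 5.0, 4.0]  # the atoms of mu shifted by 3, listed in a different order
print(wasserstein2_1d(mu, nu))  # 3.0
```

Shifting every atom by the same amount moves the measure by exactly that amount in this distance, which is the kind of "transport" geometry the paper's framework builds on. For general measures in higher dimensions, computing the distance requires solving an optimal transport problem rather than sorting.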