Abstract: Sampling a target probability distribution with an unknown normalization constant is a fundamental challenge in computational science and engineering. Recent work shows that algorithms derived by considering gradient flows in the space of probability measures open up new avenues for algorithm development. This paper makes three contributions to this sampling approach by scrutinizing the design components of such gradient flows. Any instantiation of a gradient flow for sampling needs an energy functional and a metric to determine the flow, as well as numerical approximations of the flow to derive algorithms. Our first contribution is to show that the Kullback-Leibler divergence, as an energy functional, has the unique property (among all f-divergences) that gradient flows resulting from it do not depend on the normalization constant of the target distribution. Our second contribution is to study the choice of metric from the perspective of invariance. The Fisher-Rao metric is known to be the unique choice (up to scaling) that is diffeomorphism invariant. As a computationally tractable alternative, we introduce a relaxed, affine invariance property for the metrics and gradient flows. In particular, we construct various affine invariant Wasserstein and Stein gradient flows. Affine invariant gradient flows are shown to behave more favorably than their non-affine-invariant counterparts when sampling highly anisotropic distributions, both in theory and in experiments with particle methods. Our third contribution is to study, and develop efficient algorithms based on, Gaussian approximations of the gradient flows; this leads to an alternative to particle methods. We establish connections between various Gaussian approximate gradient flows, discuss their relation to gradient methods arising from parametric variational inference, and study their convergence properties both theoretically and numerically.
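As a rough illustration of the first contribution, the sketch below runs unadjusted Langevin dynamics, a standard particle discretization of the Wasserstein gradient flow of the Kullback-Leibler divergence, on an unnormalized target. It is not the paper's algorithm: the target matrix `A`, the offset `log_c`, the step size, and all function names are assumptions chosen for illustration. The point is only that the update uses the gradient of the log-density, so an unknown multiplicative constant drops out.

```python
import numpy as np

# Hypothetical anisotropic Gaussian target: pi(x) proportional to c * exp(-0.5 * x^T A x).
A = np.diag([1.0, 10.0])

def log_density(x, log_c):
    """Log of the unnormalized target; log_c is an arbitrary constant offset."""
    return -0.5 * x @ A @ x + log_c

def grad_log_density(x, log_c, h=1e-5):
    """Central finite-difference gradient of log_density with respect to x."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        grad[i] = (log_density(x + e, log_c) - log_density(x - e, log_c)) / (2 * h)
    return grad

def langevin(log_c, n_steps=500, step=0.01, seed=0):
    """Unadjusted Langevin: x <- x + step * grad log pi(x) + sqrt(2 * step) * noise."""
    rng = np.random.default_rng(seed)
    x = np.array([3.0, -3.0])
    for _ in range(n_steps):
        noise = rng.standard_normal(2)
        x = x + step * grad_log_density(x, log_c) + np.sqrt(2 * step) * noise
    return x

x_a = langevin(log_c=0.0)
x_b = langevin(log_c=5.0)  # same target shape, different (unknown) scaling
print(np.allclose(x_a, x_b, atol=1e-6))
```

With the same random seed, the two runs differ only in the constant offset of the log-density, so their final states should agree up to finite-difference rounding error.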

Dimension reduction and the gradient flow of relative entropy

Noncommutative Model Selection for Data Clustering and Dimension Reduction Using Relative von Neumann Entropy

Discretized Gradient Flow for Manifold Learning in the Space of Embeddings

On Probabilistic Embeddings in Optimal Dimension Reduction

Gradient Flows of Generalized Relative Entropy and Functional Inequalities on Graphs

Independent projections of diffusions: Gradient flows for variational inference and optimal mean field approximations

Gradient-based explanation for non-linear non-parametric dimensionality reduction

Projected Langevin dynamics and a gradient flow for entropic optimal transport

Gradient flows on metric graphs with reservoirs: Microscopic derivation and multiscale limits

Dimension Reduction for Fréchet Regression

Dimensional reduction of gradient-like stochastic systems with multiplicative noise via Fokker-Planck diffusion maps

Theoretical Foundations of t-SNE for Visualizing High-Dimensional Clustered Data

Rearranged Stochastic Heat Equation: Ergodicity and Related Gradient Descent on ${\mathcal P}({\mathbb R})$

Fisher-Rao Gradient Flow: Geodesic Convexity and Functional Inequalities

t-SNE, Forceful Colorings and Mean Field Limits

Metric mean dimension of flows

The Dimension-Reduction Strategy Via Mapping for Probability Density Evolution Analysis of Nonlinear Stochastic Systems

Convergence analysis of t-SNE as a gradient flow for point cloud on a manifold

Manifold Diffusion Geometry: Curvature, Tangent Spaces, and Dimension

Sampling via Gradient Flows in the Space of Probability Measures

Metric Flows with Neural Networks