Sampling from the Mean-Field Stationary Distribution

Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li
2024-07-05
Abstract: We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term. Our main insight is to decouple the two key aspects of this problem: (1) approximation of the mean-field SDE via a finite-particle system, via uniform-in-time propagation of chaos, and (2) sampling from the finite-particle stationary distribution, via standard log-concave samplers. Our approach is conceptually simpler and its flexibility allows for incorporating the state-of-the-art for both algorithms and theory. This leads to improved guarantees in numerous settings, including better guarantees for optimizing certain two-layer neural networks in the mean-field regime. A key technical contribution is to establish a new uniform-in-$N$ log-Sobolev inequality for the stationary distribution of the mean-field Langevin dynamics.
Statistics Theory, Machine Learning
What problem does this paper attempt to address?
This paper studies the complexity of sampling from the stationary distribution of a mean-field stochastic differential equation (mean-field SDE). Equivalently, it studies the complexity of minimizing a functional over the space of probability measures that includes an interaction term, a problem that arises, for example, in the analysis of neural network training dynamics. The authors' main insight is to decouple two key aspects of the problem: (1) approximating the mean-field SDE by a finite-particle system, via uniform-in-time propagation of chaos; and (2) sampling from the stationary distribution of the finite-particle system with standard log-concave samplers. The approach is conceptually simpler, and its flexibility allows state-of-the-art algorithms and theory to be incorporated. This yields improved guarantees in numerous settings, including for optimizing certain two-layer neural networks in the mean-field regime. The key technical contribution is a new uniform-in-$N$ log-Sobolev inequality for the stationary distribution of the mean-field Langevin dynamics.
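The two-step idea can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a pairwise interaction model with hypothetical quadratic potentials V(x) = |x|^2/2 and W(z) = lam*|z|^2/2, and it uses the unadjusted Langevin algorithm as a simple stand-in for a "standard log-concave sampler". The function name, the parameter names, and the default values are all illustrative. Under these assumptions the N-particle target is proportional to exp(-U), with U(x_1, ..., x_N) = sum_i V(x_i) + (1/(2(N-1))) sum_{i != j} W(x_i - x_j), and the mean-field stationary distribution is the Gaussian N(0, I/(1+lam)), so the output is easy to sanity-check.

```python
import numpy as np


def grad_V(x):
    # Gradient of the hypothetical confinement potential V(x) = |x|^2 / 2.
    return x


def sample_finite_particle(n_particles=1000, dim=2, lam=1.0,
                           step=0.01, n_steps=1000, seed=0):
    """Run unadjusted Langevin on the N-particle system (illustrative only).

    For W(z) = lam * |z|^2 / 2, the interaction force on particle i is
    (1/(N-1)) * sum_{j != i} lam * (x_i - x_j) = lam * N/(N-1) * (x_i - mean).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_particles, dim))  # arbitrary initialization
    scale = lam * n_particles / (n_particles - 1)
    for _ in range(n_steps):
        mean = x.mean(axis=0, keepdims=True)
        drift = -grad_V(x) - scale * (x - mean)
        noise = rng.standard_normal(x.shape)
        # Euler-Maruyama step for dX = drift dt + sqrt(2) dB.
        x = x + step * drift + np.sqrt(2.0 * step) * noise
    return x


particles = sample_finite_particle()
# Empirical second moment per coordinate; the mean-field value under the
# assumptions above is 1 / (1 + lam).
print(float(np.mean(particles ** 2)))
```

Each row of the returned array is one particle, and its marginal law serves as an approximate sample from the mean-field stationary distribution. In this sketch there are two sources of error: the finite-particle approximation, which shrinks as `n_particles` grows, and the discretization bias of the sampler, which shrinks with `step`. This mirrors the decoupling described above, and a more sophisticated log-concave sampler could replace the loop without changing the first step.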