EigenVI: score-based variational inference with orthogonal function expansions

Diana Cai, Chirag Modi, Charles C. Margossian, Robert M. Gower, David M. Blei, Lawrence K. Saul
2024-10-31
Abstract: We develop EigenVI, an eigenvalue-based approach for black-box variational inference (BBVI). EigenVI constructs its variational approximations from orthogonal function expansions. For distributions over $\mathbb{R}^D$, the lowest order term in these expansions provides a Gaussian variational approximation, while higher-order terms provide a systematic way to model non-Gaussianity. These approximations are flexible enough to model complex distributions (multimodal, asymmetric), but they are simple enough that one can calculate their low-order moments and draw samples from them. EigenVI can also model other types of random variables (e.g., nonnegative, bounded) by constructing variational approximations from different families of orthogonal functions. Within these families, EigenVI computes the variational approximation that best matches the score function of the target distribution by minimizing a stochastic estimate of the Fisher divergence. Notably, this optimization reduces to solving a minimum eigenvalue problem, so that EigenVI effectively sidesteps the iterative gradient-based optimizations that are required for many other BBVI algorithms. (Gradient-based methods can be sensitive to learning rates, termination criteria, and other tunable hyperparameters.) We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb. On these distributions, we find that EigenVI is more accurate than existing methods for Gaussian BBVI.
Machine Learning, Computation
### What problem does this paper attempt to address?

This paper addresses a key challenge in variational inference (VI): how to approximate complex probability distributions efficiently and accurately. The authors propose EigenVI, a black-box variational inference (BBVI) method based on eigenvalue optimization. EigenVI builds its variational approximations from orthogonal function expansions and fits them by minimizing the Fisher divergence.

#### Main problems:

1. **Modeling of complex distributions**: Traditional BBVI methods usually rely on a Gaussian variational family, which struggles to model non-Gaussian, multimodal, or asymmetric distributions.
2. **Optimization difficulty**: Many BBVI algorithms rely on iterative optimization methods such as gradient descent. These methods are sensitive to learning rates, termination criteria, and other hyperparameters, which makes tuning difficult and results unstable.
3. **Computational efficiency**: For high-dimensional data, traditional BBVI methods have high computational costs and are difficult to scale to large applications.

#### Main contributions of EigenVI:

- **Orthogonal function expansion**: Constructing the variational family from an expansion in orthogonal functions (such as Hermite polynomials) gives a flexible representation of complex non-Gaussian distributions while keeping computation tractable.
- **Eigenvalue optimization**: Recasting the optimization as a minimum eigenvalue problem avoids iterative methods such as gradient descent, which improves the stability and efficiency of the fit.
- **Standardization**: Standardizing the target distribution in a preprocessing step (for example, with a linear transformation) before applying EigenVI reduces the number of basis functions required and improves computational efficiency.
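As a rough illustration of the orthogonal function expansion described above, the following sketch builds a one-dimensional density as the square of a weighted sum of normalized Hermite functions. The functional form (unit-norm coefficients, probabilists' Hermite polynomials multiplied by the square root of a standard normal density) is an assumption made for this example, and the names `hermite_functions` and `density` are hypothetical rather than taken from the authors' code.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermevander

def hermite_functions(z, K):
    """Evaluate K orthonormal Hermite functions at the points z; returns shape (len(z), K)."""
    z = np.asarray(z, dtype=float)
    He = hermevander(z, K - 1)  # columns are He_0(z), ..., He_{K-1}(z)
    norms = np.sqrt([float(factorial(k)) for k in range(K)])
    sqrt_gauss = np.exp(-z**2 / 4.0) / (2.0 * np.pi) ** 0.25  # sqrt of N(z; 0, 1)
    return He * sqrt_gauss[:, None] / norms

def density(z, alpha):
    """Assumed form q(z) = (sum_k alpha_k phi_k(z))^2 with alpha rescaled to unit norm."""
    alpha = np.asarray(alpha, dtype=float)
    alpha = alpha / np.linalg.norm(alpha)
    f = hermite_functions(z, alpha.size) @ alpha
    return f**2

grid = np.linspace(-12.0, 12.0, 4801)
dz = grid[1] - grid[0]

# Only the lowest-order term: this should coincide with a standard normal density.
q_gauss = density(grid, [1.0, 0.0, 0.0])
# Higher-order terms switched on: an asymmetric, non-Gaussian density.
q_skew = density(grid, [1.0, 0.5, 0.3])

# Riemann-sum check that both densities integrate to (approximately) one.
mass_gauss = np.sum(q_gauss) * dz
mass_skew = np.sum(q_skew) * dz
```

Because the basis functions are orthonormal, the unit-norm constraint on the coefficients is what keeps the density normalized; no separate normalizing constant has to be computed.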
### Summary of the core content of the paper:

- **Construction of the variational family**: EigenVI constructs each density in the variational family as the square of a weighted sum of orthogonal functions. The lowest-order term alone gives a Gaussian, and the higher-order terms capture non-Gaussian characteristics.
- **Optimization method**: Minimizing a stochastic estimate of the Fisher divergence reduces to a minimum eigenvalue problem, which avoids iterative gradient-based optimization.
- **Experimental verification**: The paper evaluates EigenVI on synthetic targets and on Bayesian models from posteriordb, and reports that it is more accurate than existing Gaussian BBVI methods, especially on non-Gaussian targets.

In conclusion, EigenVI provides a new and efficient BBVI method that handles complex probability distributions more flexibly while reducing the hyperparameter tuning and computational cost of the optimization.
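The reduction to a minimum eigenvalue problem can be sketched in one dimension. The sketch assumes the density has the form $q(z) = f(z)^2$ with $f = \sum_k \alpha_k \phi_k$ and $\|\alpha\| = 1$, so that the Fisher divergence becomes $\int (2f'(z) - s(z)f(z))^2\,dz$, where $s$ is the target's score; an importance-sampling estimate of this integral is a quadratic form $\alpha^\top M \alpha$. The Gaussian proposal, the sample size, and the helper names (`basis_and_derivative`, `fit_min_eigenvector`, `mixture_score`) are all choices made for this example, not the authors' implementation.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermevander

def basis_and_derivative(z, K):
    """Orthonormal Hermite functions phi_k(z) and their derivatives, each of shape (B, K)."""
    z = np.asarray(z, dtype=float)
    He = hermevander(z, K - 1)
    norms = np.sqrt([float(factorial(k)) for k in range(K)])
    sqrt_gauss = np.exp(-z**2 / 4.0) / (2.0 * np.pi) ** 0.25
    dHe = np.zeros_like(He)
    dHe[:, 1:] = He[:, :-1] * np.arange(1, K)  # He_k'(z) = k * He_{k-1}(z)
    phi = He * sqrt_gauss[:, None] / norms
    dphi = (dHe - 0.5 * z[:, None] * He) * sqrt_gauss[:, None] / norms
    return phi, dphi

def fit_min_eigenvector(score, z, proposal_pdf, K):
    """Minimize the estimated Fisher divergence alpha^T M alpha subject to ||alpha|| = 1."""
    phi, dphi = basis_and_derivative(z, K)
    v = 2.0 * dphi - score(z)[:, None] * phi   # row b holds 2*phi_k'(z_b) - s(z_b)*phi_k(z_b)
    weights = 1.0 / proposal_pdf(z)            # importance weights for samples from the proposal
    M = (v * weights[:, None]).T @ v / z.size  # symmetric positive semidefinite, shape (K, K)
    eigvals, eigvecs = np.linalg.eigh(M)       # eigenvalues in ascending order
    return eigvecs[:, 0], eigvals[0]

rng = np.random.default_rng(0)
sigma = 2.0                                    # proposal N(0, sigma^2), an arbitrary choice
z = rng.normal(0.0, sigma, size=2000)

def proposal_pdf(x):
    return np.exp(-x**2 / (2.0 * sigma**2)) / (sigma * np.sqrt(2.0 * np.pi))

# Sanity check: for a standard normal target (score -z), the lowest-order term alone
# matches the score exactly, so the estimated divergence can be driven to zero.
alpha_gauss, lam_gauss = fit_min_eigenvector(lambda x: -x, z, proposal_pdf, K=6)

def mixture_score(x, m=1.5):
    """Score of the equal mixture 0.5*N(-m, 1) + 0.5*N(m, 1): -x + m*tanh(m*x)."""
    return -x + m * np.tanh(m * x)

# Bimodal target: the fit needs only score evaluations and one eigendecomposition.
alpha_mix, lam_mix = fit_min_eigenvector(mixture_score, z, proposal_pdf, K=6)
```

The smallest eigenvalue is the attained value of the estimated divergence, so it doubles as a rough diagnostic of fit quality. There is no learning rate or stopping rule in the sketch, which is the practical appeal of the eigenvalue formulation.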