Likelihood approximations via Gaussian approximate inference

Thang D. Bui
2024-10-28
Abstract: Non-Gaussian likelihoods are essential for modelling complex real-world observations but pose significant computational challenges in learning and inference. Even with Gaussian priors, non-Gaussian likelihoods often lead to analytically intractable posteriors, necessitating approximation methods. To this end, we propose efficient schemes to approximate the effects of non-Gaussian likelihoods by Gaussian densities based on variational inference and moment matching in transformed bases. These enable efficient inference strategies originally designed for models with a Gaussian likelihood to be deployed. Our empirical results demonstrate that the proposed matching strategies attain good approximation quality for binary and multiclass classification in large-scale point-estimate and distributional inferential settings. In challenging streaming problems, the proposed methods outperform all existing likelihood approximations and approximate inference methods in the exact models. As a by-product, we show that the proposed approximate log-likelihoods are a superior alternative to least-squares on raw labels for neural network classification.
Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is: **how to approximate the effect of non-Gaussian likelihood functions with Gaussian densities efficiently and accurately, thereby simplifying modelling and inference for complex real-world observations**.

Non-Gaussian likelihoods are essential for handling complex real-world observations (such as class labels or counts), but they pose significant computational challenges during learning and inference. Even for linear models with Gaussian priors, a non-Gaussian likelihood usually leads to an analytically intractable posterior. To address this, the author proposes approximating the effect of the non-Gaussian likelihood by a Gaussian density, using variational inference and moment matching in a transformed basis. Efficient inference strategies originally designed for Gaussian-likelihood models can then be applied to a wider range of models.

### Main Contributions

1. **Efficient Approximation Method**: Schemes based on variational inference and moment matching in a transformed basis are proposed to approximate the effect of non-Gaussian likelihood functions efficiently.
2. **Wide Applicability**: The methods apply to both binary and multiclass classification and attain good approximation quality in large-scale point-estimate and distributional inference settings.
3. **Superior Performance in Online Learning**: In challenging streaming problems, the proposed approximations outperform all existing likelihood approximations, as well as approximate inference methods applied to the exact models.
4. **Alternative to Least Squares**: The paper shows that the proposed approximate log-likelihoods are a superior alternative to least-squares on raw labels for neural network classification.

### Method Overview

- **Exponential-Family Density Approximation in a Transformed Basis**: With an appropriate transformation, the density in the new basis has a bell shape close to a Gaussian, so it can be approximated well by a Gaussian density.
- **Variational Matching**: The optimal mean and covariance of the Gaussian \(q\) are found by minimizing the variational objective \(F(q)=\mathrm{KL}(q(\psi)\,\|\,p_{\psi}(\psi))\).
- **Moment Matching**: The Gaussian approximation is obtained by matching its first and second moments to those of the original density in the transformed basis.

### Experimental Verification

The paper evaluates the proposed methods in a series of experiments, including:

- Performance evaluation on standard neural network classification benchmarks.
- Application in Bayesian non-parametric settings (such as Gaussian process multiclass classification).
- Performance in online learning on streaming data.

In conclusion, the paper proposes efficient schemes for approximating non-Gaussian likelihood functions with Gaussian densities, which simplify modelling and inference for complex data and perform well across multiple tasks and settings.
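The moment-matching step from the Method Overview can be illustrated with a small sketch. It rests on assumptions that are not stated in the summary above: the original density is taken to be a Beta(a, b) density over a class probability, the transformed basis is the logit \(\psi = \log(\pi/(1-\pi))\), and the moments are computed by plain numerical quadrature. The function names, grid limits, and grid size are hypothetical choices for illustration, not the paper's implementation.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def logit_beta_logpdf(psi, a, b):
    """Log density of psi = logit(pi) when pi ~ Beta(a, b).

    Change of variables gives p(psi) = sigmoid(psi)^a * sigmoid(-psi)^b / B(a, b).
    """
    log_beta_fn = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return a * log_sigmoid(psi) + b * log_sigmoid(-psi) - log_beta_fn


def moment_match(a, b, lo=-40.0, hi=40.0, n=20001):
    """Return (mean, variance, mass) of the density in the logit basis.

    The mean and variance define the matched Gaussian; mass is the
    quadrature estimate of the normalizer and should be close to 1.
    The grid limits are an assumption that suits a, b that are not tiny.
    """
    h = (hi - lo) / (n - 1)
    grid = [lo + i * h for i in range(n)]
    dens = [math.exp(logit_beta_logpdf(p, a, b)) for p in grid]
    mass = sum(dens) * h
    mean = sum(p * d for p, d in zip(grid, dens)) * h / mass
    var = sum((p - mean) ** 2 * d for p, d in zip(grid, dens)) * h / mass
    return mean, var, mass


# Example: Gaussian N(mean, var) matched to Beta(2, 5) in the logit basis.
mean, var, mass = moment_match(2.0, 5.0)
print(f"mean={mean:.4f} var={var:.4f} mass={mass:.6f}")
```

For a Beta density these moments also have closed forms in terms of digamma and trigamma functions, so quadrature is used here only to keep the sketch dependency-free and to make the matching step explicit.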