Approximate Marginal Likelihood Inference in Mixed Models for Grouped Data

Alex Stringer
DOI: https://doi.org/10.48550/arXiv.2310.01589
2024-11-12
Abstract: A method is introduced for approximate marginal likelihood inference via adaptive Gaussian quadrature in mixed models with a single grouping factor. The core technical contribution is an algorithm for computing the exact gradient of the approximate log marginal likelihood. This leads to efficient maximum likelihood via quasi-Newton optimization that is demonstrated to be faster than existing approaches based on finite-differenced gradients or derivative-free optimization. The method is specialized to Bernoulli mixed models with multivariate, correlated Gaussian random effects; here computations are performed using an inverse log-Cholesky parameterization of the Gaussian density that involves no matrix decomposition during model fitting, while Wald confidence intervals are provided for variance parameters on the original scale. Simulations give evidence of these intervals attaining nominal coverage if enough quadrature points are used, for data comprised of a large number of very small groups exhibiting large between-group heterogeneity. In contrast, the Laplace approximation is shown to give especially poor coverage and high bias for data comprised of a large number of small groups. Adaptive quadrature mitigates this, and the methods in this paper improve the computational feasibility of this more accurate method. All results may be reproduced using code available at https://github.com/awstringer1/aghmm-paper-code.
Methodology, Computation
What problem does this paper attempt to address?
This paper addresses accurate and efficient marginal likelihood inference in mixed models, specifically mixed models for grouped data with a single grouping factor. Its main contribution is a method for approximating the marginal likelihood by Adaptive Gaussian Quadrature (AGQ), together with an algorithm for computing the exact gradient of the resulting approximate log marginal likelihood. This makes maximum likelihood estimation considerably more efficient, particularly for Bernoulli mixed models, reducing computation time by a factor of 2 to 4 compared with existing methods based on finite-difference gradients or derivative-free optimization.

### Core Problems of the Paper

1. **Calculation of Marginal Likelihood**: Because a mixed model contains latent variables (the random effects), the joint likelihood cannot be used directly for inference. Inference instead requires the marginal likelihood, that is, the integral of the joint likelihood with respect to the latent variables. This integral is usually not analytically tractable and must be approximated.
2. **High-Precision Approximation Method**: As the amount of data increases, the integration errors from the individual groups accumulate and degrade the overall quality of inference. An approximation that stays accurate as the amount of data grows is therefore required.
3. **Efficient Optimization Method**: Maximizing the approximate marginal likelihood requires an efficient numerical optimizer. The paper combines Adaptive Gaussian Quadrature with quasi-Newton optimization and, in particular, develops an algorithm for computing the gradient, which yields efficient parameter estimation.

### Main Contributions

1. **Exact Gradient Calculation**: The paper develops an algorithm that computes the exact gradient of the AGQ approximate log marginal likelihood. This makes quasi-Newton optimization practical and improves the efficiency and stability of the optimization.
2. **Efficient Optimization**: By using exact gradient information, the method is substantially faster than existing methods based on finite-difference gradients or derivative-free optimization.
3. **Application Examples**: The paper verifies the method through simulations and real-data examples. It performs particularly well on data made up of a large number of small groups.

### Formula Examples

Some key formulas from the paper are as follows:

- **Adaptive Gaussian Quadrature Approximate Marginal Likelihood**:
\[ \hat{\pi}_{\text{AQ}}^k(\theta; y) = \prod_{i=1}^m \left[ \left|\boldsymbol{L}_i(\theta, \boldsymbol{u}_i(\theta))\right|^{-1} \sum_{z \in Q(d, k)} \omega_k(z)\, \pi_i\left(\theta, \boldsymbol{L}_i(\theta, \boldsymbol{u}_i(\theta))^{-1} z + \boldsymbol{u}_i(\theta); y_i\right) \right] \]
- **Laplace Approximation** (the case \(k = 1\)):
\[ \hat{\pi}_{\text{AQ}}^1(\theta; y) = (2\pi)^{(dm)/2} \left|\boldsymbol{L}(\theta)\right|^{-1} \pi(\theta, \boldsymbol{u}(\theta); y) \]
- **Gradient Calculation**:
\[ \nabla_\theta \ell_k(\theta) = \sum_{i=1}^m \nabla_\theta \ell_i^k(\theta) \]
where
\[ \nabla_\theta \ell_i^k(\theta) = \left. \frac{\partial \ell_i^k(\theta, u, \boldsymbol{L}_i(\theta, u))}{\partial \theta} \right|_{u = \boldsymbol{u}_i(\theta)} \]

These formulas give the technical core of the method, in particular its treatment of the marginal likelihood approximation and of the gradient calculation.
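To make the adaptive quadrature formula above concrete, here is a minimal Python sketch of one group's factor in the simplest possible setting. It is an illustration under stated assumptions, not the paper's implementation. The assumptions are: a scalar random intercept (d = 1), a Bernoulli model with logit link and a single fixed intercept `beta`, a N(0, `sigma`^2) random effect, and the reading that the centering point is the mode of the group's joint log-likelihood in u and that L is the square root of the negative second derivative there. The weights ω_k(z) are read as Gauss-Hermite weights multiplied by exp(z^2/2), which is the choice that reproduces the (2π)^{d/2} factor of the Laplace formula when k = 1. All function names are hypothetical, and the quadrature rules are hard-coded for k = 1, 3, 5 only.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)
_S10 = math.sqrt(10.0)

# Gauss-Hermite rules for the weight exp(-z^2 / 2), hard-coded for k = 1, 3, 5.
# Each entry is (nodes, weights); these weights are normalised to sum to 1.
GH_RULES = {
    1: ([0.0], [1.0]),
    3: ([-math.sqrt(3.0), 0.0, math.sqrt(3.0)], [1 / 6, 2 / 3, 1 / 6]),
    5: ([-math.sqrt(5 + _S10), -math.sqrt(5 - _S10), 0.0,
         math.sqrt(5 - _S10), math.sqrt(5 + _S10)],
        [(7 - 2 * _S10) / 60, (7 + 2 * _S10) / 60, 8 / 15,
         (7 + 2 * _S10) / 60, (7 - 2 * _S10) / 60]),
}


def log_joint(u, y, beta, sigma):
    """Log of the group's joint density: Bernoulli-logit likelihood times N(0, sigma^2)."""
    eta = beta + u
    loglik = sum(y) * eta - len(y) * math.log1p(math.exp(eta))
    logprior = -0.5 * (u / sigma) ** 2 - math.log(sigma * SQRT_2PI)
    return loglik + logprior


def neg_hessian(u, y, beta, sigma):
    """Negative second derivative of log_joint with respect to u (always positive)."""
    p = 1.0 / (1.0 + math.exp(-(beta + u)))
    return len(y) * p * (1.0 - p) + 1.0 / sigma ** 2


def mode_and_scale(y, beta, sigma, iters=50):
    """Newton iterations for the mode of log_joint in u, plus the scale L at the mode."""
    u = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        grad = sum(y) - len(y) * p - u / sigma ** 2
        u += grad / neg_hessian(u, y, beta, sigma)
    return u, math.sqrt(neg_hessian(u, y, beta, sigma))


def agq_marginal(y, beta, sigma, k):
    """One group's factor: |L|^{-1} * sum_z omega_k(z) * pi_i(theta, z / L + u_hat; y_i)."""
    u_hat, L = mode_and_scale(y, beta, sigma)
    nodes, weights = GH_RULES[k]
    total = 0.0
    for z, w in zip(nodes, weights):
        omega = SQRT_2PI * w * math.exp(0.5 * z * z)
        total += omega * math.exp(log_joint(u_hat + z / L, y, beta, sigma))
    return total / L


def reference_marginal(y, beta, sigma, lo=-20.0, hi=20.0, n=40000):
    """Brute-force trapezoid rule on a wide grid, used only as a reference value."""
    h = (hi - lo) / n
    vals = [math.exp(log_joint(lo + i * h, y, beta, sigma)) for i in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))


# A very small group with large between-group heterogeneity (illustrative values).
y, beta, sigma = [1, 1, 0], 0.5, 2.0
ref = reference_marginal(y, beta, sigma)
for k in (1, 3, 5):
    approx = agq_marginal(y, beta, sigma, k)
    print(k, approx, abs(approx / ref - 1.0))
```

With k = 1 the sum collapses to a single term at the mode, which is the Laplace approximation for this group. The printed relative errors against the brute-force reference give a rough sense of how much accuracy extra quadrature points buy for a group this small. Extending the sketch to correlated multivariate random effects would need a Cholesky factor in place of the scalar L and a product rule over Q(d, k).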