Abstract: The Fisher information matrix (FIM) is a key quantity in statistics: it is required, for example, for evaluating asymptotic precisions of parameter estimates, for computing test statistics or asymptotic distributions in statistical testing, and for evaluating post-model-selection inference results or optimality criteria in experimental designs. However, its exact computation is often not trivial. In particular, in many latent variable models it is intricate due to the presence of unobserved variables. Therefore, the observed FIM is usually considered in this context to estimate the FIM. Several methods have been proposed to approximate the observed FIM when it cannot be evaluated analytically. Among the most frequently used approaches are Monte Carlo methods and iterative algorithms derived from the missing information principle. All these methods require computing second derivatives of the complete-data log-likelihood, which is a disadvantage from a computational point of view. In this paper, we present a new approach to estimate the FIM in latent variable models. The advantage of our method is that only the first derivatives of the log-likelihood are needed, contrary to other approaches based on the observed FIM. Indeed, we consider the empirical estimate of the covariance matrix of the score. We prove that this estimate of the Fisher information matrix is unbiased, consistent and asymptotically Gaussian. Moreover, we highlight that neither of the two estimates (the proposed one and the observed FIM) is better than the other in terms of asymptotic covariance matrix. When the proposed estimate cannot be directly evaluated analytically, we present a stochastic approximation estimation algorithm to compute it. This algorithm provides this estimate of the FIM as a by-product of the parameter estimates. We emphasize that the proposed algorithm only requires computing the first derivatives of the complete-data log-likelihood with respect to the parameters.
We prove that the estimation algorithm is consistent and asymptotically Gaussian when the number of iterations goes to infinity. We evaluate the finite-sample properties of the proposed estimate and of the observed FIM through simulation studies in linear mixed-effects models and mixture models. We also investigate the convergence properties of the estimation algorithm in nonlinear mixed-effects models. We compare the performance of the proposed algorithm to that of other existing methods.
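
To make the idea of a score-based estimate concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes a fully observed Gaussian model N(mu, sigma2) in which the per-observation score is available in closed form, and the function names (`gaussian_scores`, `score_based_fim`) are hypothetical. In a latent variable model the individual scores would have to be obtained differently, for instance as conditional expectations of the complete-data score. The sketch averages the outer products of the individual scores, which uses first derivatives only, and compares the result with the known FIM of the Gaussian model.

```python
import numpy as np

def gaussian_scores(y, mu, sigma2):
    """Per-observation scores of N(mu, sigma2) with respect to (mu, sigma2)."""
    r = y - mu
    d_mu = r / sigma2                                  # d/d mu of log-density
    d_sigma2 = -0.5 / sigma2 + 0.5 * r**2 / sigma2**2  # d/d sigma2 of log-density
    return np.column_stack([d_mu, d_sigma2])           # shape (n, 2)

def score_based_fim(scores):
    """Average of score outer products: (1/n) * sum_i s_i s_i^T."""
    return scores.T @ scores / scores.shape[0]

# Simulated data (illustrative values, not taken from the paper).
rng = np.random.default_rng(0)
true_mu, true_sigma2 = 1.0, 2.0
y = rng.normal(true_mu, np.sqrt(true_sigma2), size=20000)

# Maximum likelihood estimates for the Gaussian model.
mu_hat = y.mean()
sigma2_hat = y.var()  # ddof=0 gives the MLE of the variance

# Score-based estimate of the per-observation FIM, evaluated at the MLE.
fim_hat = score_based_fim(gaussian_scores(y, mu_hat, sigma2_hat))

# Known per-observation FIM of N(mu, sigma2) at the true parameters.
fim_true = np.diag([1.0 / true_sigma2, 0.5 / true_sigma2**2])

print(fim_hat)
print(fim_true)
```

At the maximum likelihood estimate the scores sum to zero, so the average of outer products coincides with the empirical covariance of the score (with normalization 1/n). No Hessian is evaluated at any point, which is the computational advantage the abstract emphasizes.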

Information Splitting for Big Data Analytics

How Data Heterogeneity Affects Innovating Knowledge and Information in Gene Identification: A Statistical Learning Perspective

AIMS: Average Information Matrix Splitting

Application of a genomic model for high-dimensional chemometric analysis

Distributed Bootstrap Simultaneous Inference for High-Dimensional Quantile Regression

Computing log-likelihood and its derivatives for restricted maximum likelihood methods

How to estimate Fisher information matrices from simulations

Mixed Model Approaches for Detecting Influential Observations in Genetic Data Analysis

An Efficient Algorithm for Information Decomposition and Extraction.

Integrative analysis of individual-level data and high-dimensional summary statistics

Information decomposition in complex systems via machine learning

Inference for High-Dimensional Linear Mixed-Effects Models: A Quasi-Likelihood Approach

Divide and Recombine for Large and Complex Data: Model Likelihood Functions using MCMC

Information FOMO: The Unhealthy Fear of Missing Out on Information—A Method for Removing Misleading Data for Healthier Models

Mini-Hes: A Parallelizable Second-order Latent Factor Analysis Model

Statistical Inference for Large-dimensional Matrix Factor Model from Least Squares and Huber Loss Points of View

Estimating Fisher Information Matrix in Latent Variable Models based on the Score Function

Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities

Mutual Information Estimation via $f$-Divergence and Data Derangements

Resampling-Based Multisplit Inference for High-Dimensional Regression