Empirical Bayes in Bayesian learning: understanding a common practice
Stefano Rizzelli, Judith Rousseau, Sonia Petrone
2024-02-29
Abstract: In applications of Bayesian procedures, even when the prior law is carefully
specified, it may be delicate to elicit the prior hyperparameters, so that it is
often tempting to fix them from the data, usually by their maximum marginal
likelihood estimates (MMLE), obtaining a so-called empirical Bayes posterior distribution.
Although questionable, this is a common practice, but its theoretical properties
are mostly available only on a case-by-case basis. In this paper we provide
general properties for parametric models. First, we study the limit behavior of
the MMLE and prove results in quite general settings, while also
conceptualizing the frequentist context as an unexplored case of maximum
likelihood estimation under model misspecification. We cover both identifiable
models, illustrating applications to sparse regression, and non-identifiable
models - specifically, overfitted mixture models. Finally, we prove higher
order merging results. In regular cases, the empirical Bayes posterior is shown
to be a fast approximation to the Bayesian posterior distribution of the
researcher who, within the given class of priors, has the most information
about the true model's parameters. This is a faster approximation than classic
Bernstein-von Mises results. Given the class of priors, our work gives
formal content to common beliefs about this popular practice.
Statistics Theory
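
To make the empirical Bayes recipe described in the abstract concrete, here is a minimal sketch on a toy conjugate model that is not taken from the paper. It assumes observations x_1, ..., x_n i.i.d. N(theta, 1) and a prior theta ~ N(lam, tau2) with tau2 fixed, so the only hyperparameter is the prior mean lam. Under these assumptions the marginal likelihood depends on lam only through the sample mean, and the MMLE is the sample mean itself. All function names are hypothetical and chosen for illustration.

```python
import random
import statistics


def log_marginal(lam, xs, tau2=1.0):
    """Log marginal likelihood in lam, up to an additive constant.

    Assumed toy model: x_i | theta ~ N(theta, 1) i.i.d., theta ~ N(lam, tau2).
    Marginally, the sample mean is N(lam, tau2 + 1/n), and the remaining
    terms of the marginal density do not involve lam.
    """
    n = len(xs)
    xbar = statistics.fmean(xs)
    return -0.5 * (xbar - lam) ** 2 / (tau2 + 1.0 / n)


def mmle_prior_mean(xs):
    """MMLE of lam in the assumed model: the maximizer of log_marginal."""
    return statistics.fmean(xs)


def eb_posterior(xs, tau2=1.0):
    """Empirical Bayes posterior N(mean, var) for theta, with lam plugged in."""
    n = len(xs)
    xbar = statistics.fmean(xs)
    lam_hat = mmle_prior_mean(xs)
    precision = n + 1.0 / tau2  # data precision plus prior precision
    mean = (n * xbar + lam_hat / tau2) / precision
    return mean, 1.0 / precision


random.seed(0)
data = [random.gauss(2.0, 1.0) for _ in range(200)]  # simulated data, true theta = 2
lam_hat = mmle_prior_mean(data)
post_mean, post_var = eb_posterior(data)
```

In this toy case the plug-in prior is centred at the sample mean, so the empirical Bayes posterior mean coincides with the sample mean, while the posterior variance 1 / (n + 1 / tau2) is the same as for any fixed prior mean. The general results summarised in the abstract concern far broader parametric settings than this sketch.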