Doublethink: simultaneous Bayesian-frequentist model-averaged hypothesis testing

Helen R. Fryer, Nicolas Arning, Daniel J. Wilson

2024-07-27

Abstract: Establishing the frequentist properties of Bayesian approaches widens their appeal and offers new understanding. In hypothesis testing, Bayesian model averaging addresses the problem that conclusions are sensitive to variable selection. But Bayesian false discovery rate (FDR) guarantees are contingent on prior assumptions that may be disputed. Here we show that Bayesian model-averaged hypothesis testing is a closed testing procedure that controls the frequentist familywise error rate (FWER) in the strong sense. The rate converges pointwise as the sample size grows and, under some conditions, uniformly. The 'Doublethink' method computes simultaneous posterior odds and asymptotic p-values for model-averaged hypothesis testing. We explore its benefits, including post-hoc variable selection, and limitations, including finite-sample inflation, through a Mendelian randomization study and simulations comparing approaches like LASSO, stepwise regression, the Benjamini-Hochberg procedure and e-values.

Methodology, Statistics Theory

What problem does this paper attempt to address?

The paper addresses three issues:

1. **Sensitivity to variable selection and disputed priors**: Bayesian model averaging addresses the problem that conclusions in hypothesis testing are sensitive to variable selection. However, Bayesian false discovery rate (FDR) guarantees depend on prior assumptions that may be disputed.
2. **Unification of Bayesian and frequentist methods**: The paper shows that Bayesian model-averaged hypothesis testing is a closed testing procedure that controls the frequentist familywise error rate (FWER) in the strong sense. The rate converges pointwise as the sample size grows and, under some conditions, uniformly.
3. **The Doublethink method**: The proposed method, "Doublethink", computes simultaneous posterior odds and asymptotic p-values for model-averaged hypothesis testing. Its benefits (including post-hoc variable selection) and limitations (including finite-sample inflation) are explored through a Mendelian randomization study and simulations comparing approaches such as LASSO, stepwise regression, the Benjamini-Hochberg procedure, and e-values.

In summary, the paper aims to make hypothesis testing more robust to variable selection and to prior assumptions by giving Bayesian model-averaged tests frequentist error-rate guarantees.
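To make the idea of model-averaged posterior odds concrete, here is a minimal sketch. It is not the paper's Doublethink implementation, and it does not compute p-values. It only shows the standard calculation in which the posterior odds that a variable has an effect are obtained by summing posterior model probabilities over the models that include the variable and dividing by the sum over the models that exclude it. The function name, the variable names `x1` and `x2`, and the posterior probabilities are all hypothetical values chosen for illustration.

```python
def model_averaged_posterior_odds(model_probs, variable):
    """Posterior odds that `variable` is included, averaged over models.

    `model_probs` maps a frozenset of included variable names to that
    model's posterior probability (hypothetical inputs; they do not
    need to be normalised because only the ratio is used).
    """
    included = sum(p for m, p in model_probs.items() if variable in m)
    excluded = sum(p for m, p in model_probs.items() if variable not in m)
    if excluded == 0:
        return float("inf")  # every model with positive mass includes it
    return included / excluded


# Hypothetical posterior probabilities over all subsets of two variables.
posterior = {
    frozenset(): 0.10,
    frozenset({"x1"}): 0.60,
    frozenset({"x2"}): 0.05,
    frozenset({"x1", "x2"}): 0.25,
}

odds_x1 = model_averaged_posterior_odds(posterior, "x1")  # 0.85 / 0.15
odds_x2 = model_averaged_posterior_odds(posterior, "x2")  # 0.30 / 0.70
print(f"x1: {odds_x1:.3f}, x2: {odds_x2:.3f}")
```

Because the odds pool evidence across every model rather than conditioning on one selected model, the conclusion about `x1` does not hinge on whether `x2` happens to be included.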

A frequentist two-sample test based on Bayesian model selection

Model-averaged Bayesian t tests

Identifying direct risk factors in UK Biobank with simultaneous Bayesian-frequentist model-averaged hypothesis testing using Doublethink

Simultaneous inference: When should hypothesis testing problems be combined?

Factors in a chloroplast extract specifically bind to the 5' untranslated regions of chloroplast mRNAs.

Consistent estimation of the proportion of false nulls and FDR for adaptive multiple testing Normal means under weak dependence

A puzzle of proportions: Two popular Bayesian tests can yield dramatically different conclusions

Multiple testing with the structure adaptive Benjamini-Hochberg algorithm

Multiple Testing in Nonparametric Hidden Markov Models: An Empirical Bayes Approach

Compound e-values and empirical Bayes

Smaller $p$-values in genomics studies using distilled historical information

The flaw of averages: Bayes factors as posterior means of the likelihood ratio

Comparing researchers' degree of dichotomous thinking using frequentist versus Bayesian null hypothesis testing

Statistical significance testing for mixed priors: a combined Bayesian and frequentist analysis

Empirical partially Bayes multiple testing and compound $χ^2$ decisions

Mechanistic implications of cyclic ADP-ribose hydrolysis and methanolysis catalyzed by calf spleen NAD+glycohydrolase.

Empirical Bayes large-scale multiple testing for high-dimensional binary outcome data

Empirical Bayes factors for common hypothesis tests

Confidence distributions and hypothesis testing

A New Multiple Testing Method in the Dependent Case