Using evidence to make decisions

Charles Jenkins
DOI: https://doi.org/10.48550/arXiv.2009.01991
2020-09-04
Instrumentation and Methods for Astrophysics
Abstract: Bayesian evidence ratios offer an attractive way to compare models, and being able to quote the odds on a particular model seems a clear motivation for making a choice. Jeffreys' scale of evidence is often used to interpret evidence ratios. A natural question is: how often will you get it right when you choose on the basis of some threshold value of the evidence ratio? The evidence ratio differs between realizations of the data, and its utility can be examined in a Neyman-Pearson-like way to see the trade-off between statistical power (the chance of "getting it right") and the false alarm rate (picking the alternative hypothesis when the null is actually true). I will show some simple examples in which the evidence ratio has a surprisingly large range under different realizations of the data. It seems best not to rely on Jeffreys' scale alone when decisions have to be taken, but also to examine the probability of taking the "wrong" decision if some evidence ratio is taken to be decisive. Interestingly, Turing knew this and applied it during WWII, although (like much else) he did not publish it.
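The power versus false-alarm trade-off described in the abstract can be illustrated with a small Monte Carlo sketch. The setup below is an assumption chosen for simplicity and is not taken from the paper: n unit-variance Gaussian measurements, a null hypothesis H0 with mean zero, and an alternative H1 whose mean has a Gaussian prior of width tau. The function names, the default values, and the threshold of 10 are all illustrative.

```python
import numpy as np

def log_bayes_factor(xbar, n, tau):
    """ln of the evidence ratio B10 (H1 over H0) given the sample mean xbar.

    Assumed toy model: data ~ N(mu, 1); H0: mu = 0; H1: mu ~ N(0, tau^2).
    """
    var0 = 1.0 / n            # variance of the sample mean under H0
    var1 = tau**2 + 1.0 / n   # marginal variance of the sample mean under H1
    return 0.5 * np.log(var0 / var1) + 0.5 * xbar**2 * (1.0 / var0 - 1.0 / var1)

def simulate_rates(n=20, tau=1.0, threshold=10.0, n_sim=100_000, seed=0):
    """Estimate power, false alarm rate, and the spread of log10(B10) under H1."""
    rng = np.random.default_rng(seed)
    sigma_mean = np.sqrt(1.0 / n)

    # Realizations of the sample mean when H0 is true.
    xbar_h0 = rng.normal(0.0, sigma_mean, n_sim)

    # Realizations when H1 is true: draw mu from its prior, then the sample mean.
    mu = rng.normal(0.0, tau, n_sim)
    xbar_h1 = rng.normal(mu, sigma_mean)

    log_b_h0 = log_bayes_factor(xbar_h0, n, tau)
    log_b_h1 = log_bayes_factor(xbar_h1, n, tau)

    log_threshold = np.log(threshold)
    false_alarm = np.mean(log_b_h0 > log_threshold)  # choose H1 although H0 is true
    power = np.mean(log_b_h1 > log_threshold)        # choose H1 when H1 is true

    # 5th and 95th percentiles of log10(B10) across H1 realizations.
    spread = np.percentile(log_b_h1 / np.log(10.0), [5, 95])
    return power, false_alarm, spread

power, false_alarm, spread = simulate_rates()
print(f"power = {power:.3f}, false alarm rate = {false_alarm:.4f}")
print(f"log10(B10) under H1, 5th to 95th percentile: {spread[0]:.2f} to {spread[1]:.2f}")
```

Varying `threshold` traces out how power and false alarm rate move together, and the percentile range gives a rough sense of how much the evidence ratio can vary between realizations in this toy model.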