Semantic and Cognitive Tools to Aid Statistical Science: Replace Confidence and Significance by Compatibility and Surprise
Zad Rafi, Sander Greenland
DOI: https://doi.org/10.1186/s12874-020-01105-9
2020-10-01
Abstract: Researchers often misinterpret and misrepresent statistical outputs. This abuse has led to a large literature on modification or replacement of testing thresholds and $P$-values with confidence intervals, Bayes factors, and other devices. Because the core problems appear cognitive rather than statistical, we review simple aids to statistical interpretations. These aids emphasize logical and information concepts over probability, and thus may be more robust to common misinterpretations than are traditional descriptions. We use the Shannon transform of the $P$-value $p$, also known as the binary surprisal or $S$-value $s=-\log_{2}(p)$, to measure the information supplied by the testing procedure, and to help calibrate intuitions against simple physical experiments like coin tossing. We also use tables or graphs of test statistics for alternative hypotheses, and interval estimates for different percentile levels, to thwart fallacies arising from arbitrary dichotomies. Finally, we reinterpret $P$-values and interval estimates in unconditional terms, which describe compatibility of data with the entire set of analysis assumptions. We illustrate these methods with a reanalysis of data from an existing record-based cohort study. In line with other recent recommendations, we advise that teaching materials and research reports discuss $P$-values as measures of compatibility rather than significance, compute $P$-values for alternative hypotheses whenever they are computed for null hypotheses, and interpret interval estimates as showing values of high compatibility with data, rather than regions of confidence. Our recommendations emphasize cognitive devices for displaying the compatibility of the observed data with various hypotheses of interest, rather than focusing on single hypothesis tests or interval estimates. We believe these simple reforms are well worth the minor effort they require.
Methodology, Quantitative Methods, Applications
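The $S$-value defined in the abstract, $s=-\log_{2}(p)$, can be illustrated with a minimal sketch. The code below is not the authors' implementation. The function names `s_value` and `wald_p_value`, the normal (Wald) approximation for the two-sided $P$-value, and the estimate and standard error are all assumptions chosen for illustration; none of the numbers come from the cohort study reanalyzed in the paper. For calibration against coin tossing, $p=0.05$ gives about 4.3 bits, roughly the information in seeing four heads in four tosses of a fair coin (probability $1/16$).

```python
import math


def s_value(p: float) -> float:
    """Binary surprisal (S-value) in bits: s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


def wald_p_value(estimate: float, std_error: float, hypothesis: float) -> float:
    """Two-sided P-value for a hypothesized parameter value.

    Assumes a normal (Wald) approximation, which is an illustrative
    choice here and not necessarily the test used in the paper.
    """
    z = (estimate - hypothesis) / std_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical estimate and standard error (e.g. on a log-ratio scale).
estimate, std_error = 0.5, 0.25

# P-values and S-values for the null and for alternative hypotheses.
for hypothesis in (0.0, 0.25, 0.5, 1.0):
    p = wald_p_value(estimate, std_error, hypothesis)
    print(f"hypothesis={hypothesis:.2f}  p={p:.4f}  s={s_value(p):.2f} bits")
```

Tabulating $P$-values and $S$-values over several hypothesized values, rather than only the null, follows the abstract's advice to compute them for alternatives as well. Larger $s$ means the data supply more information against that hypothesis, given the assumed model.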