Abstract: Epidemiological analyses often rely on p-values whose classical interpretation has been lost. We reinforce the classical interpretation of p-values in randomized experiments, especially in settings with “big data” and consequently many tests. Fisher first introduced the concept of the p-value by tying it to the need for randomization of units to treatments. He proposed that researchers assess the sharp null hypothesis by conducting a randomization test summarized by a p-value. First, one chooses a suitable test statistic, S, and calculates its observed value, say S*. Then, one constructs the distribution of S induced by the randomization under the null hypothesis. To obtain this randomization distribution, one enumerates all possible treatment assignments (Ntotal of them) under the assignment mechanism and, for each, calculates the value of S that would have been observed with that assignment. The p-value is the proportion of these values of S, across the possible randomizations, that are as large as or larger than S*. To illustrate, consider a randomized experiment with 20 participants and 3 treatments given to each participant in random order, so that there are six (=3*2*1) possible sequences of treatments. Whatever the test statistic, the minimum achievable p-value equals 1/Ntotal (i.e., 1/(20*6)≈0.0083), which is attained when S* is the largest of all possible values of S. Ntotal depends on the number of units and treatments. Bonferroni adjustments (e.g., dividing the significance level by the number of tests), often used to “correct” for multiple testing in environmental studies with high-dimensional outcomes (e.g., methylation at 450,000 CpG sites), ignore the classical insight of randomization-based p-values because they are applied to model-based p-values, which are not justified by the study design. Here, applying Bonferroni with thousands of tests would yield a nonsensical “corrected” significance level less than the minimum achievable p-value.
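The randomization test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy design (six units, three assigned to treatment, completely randomized), the outcome values, the difference-in-means statistic, and all function names are hypothetical and are not taken from the study. The last lines compare the minimum achievable p-value quoted in the text, 1/(20*6), with a Bonferroni threshold, assuming a nominal significance level of 0.05, which the text does not specify.

```python
from itertools import combinations


def mean_difference(outcomes, treated):
    """Test statistic S: mean outcome of treated units minus mean of controls."""
    t = [y for i, y in enumerate(outcomes) if i in treated]
    c = [y for i, y in enumerate(outcomes) if i not in treated]
    return sum(t) / len(t) - sum(c) / len(c)


def randomization_p_value(outcomes, treated, statistic):
    """Exact one-sided randomization p-value under the sharp null.

    Under the sharp null, outcomes do not depend on the assignment, so S can
    be recomputed for every assignment the mechanism could have produced.
    Returns (p_value, number_of_possible_assignments).
    """
    n_units = len(outcomes)
    s_observed = statistic(outcomes, treated)
    # Enumerate every way of choosing the treated group (assumed mechanism:
    # complete randomization with a fixed number of treated units).
    assignments = list(combinations(range(n_units), len(treated)))
    as_large = sum(
        1 for a in assignments if statistic(outcomes, set(a)) >= s_observed
    )
    return as_large / len(assignments), len(assignments)


# Hypothetical toy data: units 0, 1, 2 were treated.
outcomes = [9, 8, 7, 1, 2, 3]
treated = {0, 1, 2}
p_value, n_total = randomization_p_value(outcomes, treated, mean_difference)
print(f"Ntotal = {n_total}, p-value = {p_value}")

# Minimum achievable p-value from the text's illustration, Ntotal = 20*6.
min_p = 1 / (20 * 6)

# Bonferroni threshold for 450,000 tests; alpha = 0.05 is an assumption.
alpha = 0.05
n_tests = 450_000
bonferroni_threshold = alpha / n_tests
print(f"min p = {min_p:.4f}, Bonferroni threshold = {bonferroni_threshold:.2e}")
print("threshold below minimum achievable p-value:", bonferroni_threshold < min_p)
```

With these toy data the observed assignment yields the largest possible value of S, so the p-value equals 1/Ntotal for that design. In the second part, the Bonferroni threshold falls far below 1/(20*6), so no randomization-based p-value from such a design could ever reach it.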
