Abstract: In genetic association studies, detecting disease-genotype associations is a primary goal. For most diseases, the underlying genetic model is unknown, so we study seven robust test statistics for monotone association. For a given test statistic, there are many ways to calculate a p-value, but in genetic association studies, calculations have predominantly been based on asymptotic approximations or on simulated permutations. We show that when the number of permutations tends to infinity, the permutation p-value approaches the exact conditional enumeration p-value, and further that calculating the latter p-value is much more efficient than performing simulated permutations. We then answer two research questions. (i) Which of the test statistics under study are the most powerful for monotone genetic models? (ii) Based on test size, power, and computational considerations, should asymptotic approximations or exact conditional enumeration be used for calculating p-values? We have studied case-control sample sizes with 500-5000 cases and 500-15000 controls, and significance levels from 5e-8 to 0.05; thus, our results are applicable to genetic association studies with only one genetic marker under study, intermediate follow-up studies, and genome-wide association studies. We find that if all monotone genetic models are of interest, the best performance is achieved for a test statistic based on the maximum over a range of Cochran-Armitage trend tests with different scores and for a constrained likelihood ratio test. For significance levels below 0.05, asymptotic approximations may give a test size up to 20 times the nominal level, and should therefore be used with caution. Further, calculating p-values based on exact conditional enumeration is a powerful, valid and computationally feasible approach, and we advocate its use in genetic association studies.
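To make the relation between the two p-value calculations concrete, here is a minimal Python sketch for a single 2x3 case-control genotype table. It is an illustration under stated assumptions, not the authors' implementation: it uses one Cochran-Armitage-type trend score with additive scores (0, 1, 2) rather than any of the seven statistics studied, and the function names (`trend_stat`, `exact_conditional_pvalue`, `permutation_pvalue`) are hypothetical. Conditional on the table margins, the case genotype counts follow a multivariate hypergeometric distribution under the null hypothesis, so the exact p-value sums the probabilities of all tables whose statistic is at least as large as the observed one.

```python
from math import comb
import random

SCORES = (0, 1, 2)  # assumption: additive scores; other monotone scores are possible


def trend_stat(cases, totals, scores=SCORES):
    """Squared trend score, scaled to an integer so ties compare exactly.

    Given fixed margins the variance of the score is constant, so ranking
    tables by this value matches ranking by the standardized statistic.
    """
    n_total, n_cases = sum(totals), sum(cases)
    u = (n_total * sum(w * c for w, c in zip(scores, cases))
         - n_cases * sum(w * t for w, t in zip(scores, totals)))
    return u * u


def exact_conditional_pvalue(cases, totals):
    """Enumerate every 2x3 table with the observed margins."""
    n0, n1, n2 = totals
    n_cases = sum(cases)
    observed = trend_stat(cases, totals)
    weight = 0  # integer count of case/control labelings at least as extreme
    for c0 in range(max(0, n_cases - n1 - n2), min(n0, n_cases) + 1):
        for c1 in range(max(0, n_cases - c0 - n2), min(n1, n_cases - c0) + 1):
            c2 = n_cases - c0 - c1
            if trend_stat((c0, c1, c2), totals) >= observed:
                weight += comb(n0, c0) * comb(n1, c1) * comb(n2, c2)
    return weight / comb(sum(totals), n_cases)


def permutation_pvalue(cases, totals, n_perm=20000, seed=0):
    """Monte Carlo estimate: draw which individuals are cases at random."""
    rng = random.Random(seed)
    genotypes = [g for g, t in enumerate(totals) for _ in range(t)]
    n_cases = sum(cases)
    observed = trend_stat(cases, totals)
    hits = 0
    for _ in range(n_perm):
        drawn = rng.sample(genotypes, n_cases)
        perm_cases = [drawn.count(g) for g in range(3)]
        if trend_stat(perm_cases, totals) >= observed:
            hits += 1
    # Plain proportion, used here only to show convergence to the exact value.
    return hits / n_perm


# Toy table: 2 individuals per genotype, both cases carry genotype 2.
print(exact_conditional_pvalue((0, 0, 2), (2, 2, 2)))
print(permutation_pvalue((0, 0, 2), (2, 2, 2)))
```

For the toy table, only two of the 15 equally likely case assignments are as extreme as the observed one, so the exact value is 2/15, and the simulated proportion fluctuates around it. The enumeration visits each table with the given margins once, whereas the simulation needs many draws to resolve small p-values, which is the efficiency argument the abstract makes.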

Robust Methods for Disease-Genotype Association in Genetic Association Studies: Calculate P-values Using Exact Conditional Enumeration instead of Asymptotic Approximations
