Is the Tail-Strength Measure More Powerful in Tests of Genetic Association?
Shizhong Han, Li Ma, Dawei Li, Bao Zhu Yang
DOI: https://doi.org/10.1016/j.ajhg.2009.01.015
2009-01-01
Abstract: To the Editor: It is well known that Hardy-Weinberg equilibrium (HWE) is an important property in population genetics. Deviation from HWE among cases can provide evidence for a valid association (ref. 1: Feder J.N., Gnirke A., Thomas W., Tsuchihashi Z., Ruddy D.A., Basava A., Dormishian F., Domingo Jr., R., Ellis M.C., Fullan A., et al. A novel MHC class I-like gene is mutated in patients with hereditary haemochromatosis. Nat. Genet. 1996; 13: 399-408; ref. 2: Nielsen D.M., Ehm M.G., Weir B.S. Detecting marker-disease association by testing for HW disequilibrium at a marker locus. Am. J. Hum. Genet. 1998; 63: 1531-1540; ref. 3: Czika W., Weir B.S. Properties of the multiallelic trend test. Biometrics. 2004; 60: 69-74; ref. 4: Wittke-Thompson J.K., Pluzhnikov A., Cox N.J. Rational inferences about departures from HW equilibrium. Am. J. Hum. Genet. 2005; 76: 967-986). Thus, it would be advisable to incorporate information from the HWE test to improve the power of detecting associated variants in genetic association studies. In the July 2008 issue of The Journal, Wang et al. (ref. 5: Wang J., Shete S. A test for genetic association that incorporates information about deviation from Hardy-Weinberg proportions in cases. Am. J. Hum. Genet. 2008; 83: 53-63) described a test statistic based on the tail-strength (TS) measure (ref. 6: Taylor J., Tibshirani R. A tail strength measure for assessing the overall univariate significance in a dataset. Biostatistics. 2006; 7: 167-181) for evaluating the global null hypothesis that the SNP is not associated with disease. The statistic is a function of two p values: one from a logistic-regression test in a genetic association study and one from an HWE test in cases.
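To make the mean-based statistic concrete, the following minimal sketch computes a tail-strength value from a set of p values. It assumes the usual definition from Taylor and Tibshirani, in which each ordered p value p_(k) is compared with its null expectation k/(m + 1); the function name `tail_strength` is hypothetical, and the sketch does not reproduce the exact null distribution derived by Wang et al.

```python
def tail_strength(p_values):
    """Mean-based tail strength of a collection of p values.

    Assumed definition: TS = (1/m) * sum_k [1 - p_(k) * (m + 1) / k],
    where p_(1) <= ... <= p_(m) are the ordered p values.
    """
    m = len(p_values)
    ordered = sorted(p_values)
    # Under the global null with independent uniform p values,
    # E[p_(k)] = k / (m + 1), so every term has expectation zero.
    terms = (1.0 - p * (m + 1) / k for k, p in enumerate(ordered, start=1))
    return sum(terms) / m


# Two p values, as in the setting of this letter: one from the association
# test and one from the HWE test in cases (values chosen for illustration).
ts = tail_strength([0.01, 0.5])
```

Large positive values indicate p values smaller than expected under the global null; a value near zero is what the null predicts on average.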
The authors further extended the mean-based TS measure to a median-based measure (TSM) by measuring the deviation of each p value from its median value instead of its expected value. On the basis of simulation studies and real disease studies, the authors stated that the adopted TS measure was more powerful than the traditional logistic-regression test and that the type I error was also well controlled. However, we have two main concerns about these conclusions. First, the two assumptions required for deriving the exact distribution of the TS and TSM statistics under the null hypothesis hold only in certain scenarios. When either assumption is violated, the exact distribution that the authors derived no longer applies. The first assumption is that the two p values, from the HWE test and the logistic-regression test, are independent. This assumption may be violated, given that the two tests use the same case data. The authors mention that the adopted TS measure allows dependence between individual tests; yet, how to take the correlation into consideration was not discussed. The second assumption is that the null distribution of the HWE test p values is uniform (0, 1); this assumption is violated when the exact test is applied for assessment of HWE (ref. 7: Rohlfs R.V., Weir B.S. Distributions of Hardy-Weinberg equilibrium test statistics. Genetics. 2008; 180: 1609-1616). To evaluate the validity of these two assumptions, we generated 10,000 data sets of 500 cases and 500 controls at various minor-allele frequencies (MAFs) (0.05, 0.1, 0.2, 0.3, 0.4, and 0.5) under the null hypothesis of no association between the SNP and disease status. We first assessed the correlations between the p values of the HWE exact test and the association test (likelihood-ratio test) under four different genetic models: additive, genotypic, dominant, and recessive (Table 1).
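The null simulation described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for Table 1: all function names are hypothetical, genotypes are drawn from a population in HWE, only the genotypic model is shown, and the 2-df likelihood-ratio test is computed as a G test on the 2 x 3 genotype table (which coincides with the saturated logistic-regression likelihood-ratio test when all three genotypes are observed). The default number of replicates is smaller than 10,000 to keep the run short.

```python
import math
import random


def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact HWE p value: total probability of all heterozygote counts that
    are no more likely than the observed one, given the allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n - n_a

    def log_weight(het):
        # Log of the conditional probability, up to an additive constant.
        return (het * math.log(2.0)
                - math.lgamma((n_a - het) // 2 + 1)
                - math.lgamma(het + 1)
                - math.lgamma((n_b - het) // 2 + 1))

    logs = [log_weight(h) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    observed = math.exp(log_weight(n_ab) - top)
    tail = sum(w for w in weights if w <= observed * (1 + 1e-9))
    return min(1.0, tail / sum(weights))


def genotypic_lrt_pvalue(case_counts, control_counts):
    """2-df likelihood-ratio (G) test on the 2 x 3 genotype table.
    Assumes every genotype class is observed at least once overall."""
    n_case, n_ctrl = sum(case_counts), sum(control_counts)
    total = n_case + n_ctrl
    g = 0.0
    for ca, co in zip(case_counts, control_counts):
        for obs, row in ((ca, n_case), (co, n_ctrl)):
            if obs > 0:
                g += 2.0 * obs * math.log(obs * total / (row * (ca + co)))
    return math.exp(-max(g, 0.0) / 2.0)  # chi-square tail area with 2 df


def draw_counts(n, maf, rng):
    """Genotype counts [major hom., het., minor hom.] for n people in HWE."""
    counts = [0, 0, 0]
    for _ in range(n):
        counts[(rng.random() < maf) + (rng.random() < maf)] += 1
    return counts


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_null_correlation(n_rep=2000, n_case=500, n_ctrl=500,
                              maf=0.2, seed=1):
    """Correlation between HWE-in-cases and association p values when
    cases and controls share one genotype distribution (the null)."""
    rng = random.Random(seed)
    p_hwe, p_assoc = [], []
    for _ in range(n_rep):
        cases = draw_counts(n_case, maf, rng)
        controls = draw_counts(n_ctrl, maf, rng)
        p_hwe.append(hwe_exact_pvalue(*cases))
        p_assoc.append(genotypic_lrt_pvalue(cases, controls))
    return pearson(p_hwe, p_assoc)
```

Calling `simulate_null_correlation(maf=0.05)` or another MAF varies the setting; the additive, dominant, and recessive models would need a different association test and are not covered by this sketch.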
“Genotypic model” means that the genotypes are coded as categorical variables and the major-allele homozygote is taken as a reference. We observed nonignorable correlations between the two p values when a genotypic effect was modeled, whereas the correlations were low at all MAFs when an additive effect was modeled. For dominant and recessive models, the correlations between the two p values were not stable but depended on the MAF. We also found that the HWE exact test p values did not follow a uniform distribution in the same simulated data (Figure 1). At the MAF of 0.05, the p values of the HWE tests were skewed to the left and did not follow a uniform distribution, and the degree of skew gradually diminished as the MAF increased. It is well known that under the null hypothesis, the p value based on a continuous test statistic has a uniform distribution over the interval [0,1], regardless of the sample size of the experiment (ref. 8: Hung H.M., O'Neill R.T., Bauer P., Köhne K. The behavior of the P-value when the alternative hypothesis is true. Biometrics. 1997; 53: 11-22). What we observed in Figure 1 is due to the discreteness of the test. In essence, the HWE exact test is based on a discrete hypergeometric distribution of the data under the null hypothesis of HWE. Given the sample size and the MAF, only a finite number of distinct genotype configurations is possible, and therefore only a finite number of p values can occur, which produces a coarse distribution. When the sample size or MAF is small, the number of possible distinct genotype configurations is small, and each configuration has its own probability, so the distribution of p values departs from uniform. In particular, a spike close to a p value of 1 will appear for the most frequent sample configuration.
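The discreteness argument can be checked directly by enumerating every heterozygote count that is compatible with a given sample size and minor-allele count. The sketch below is an illustration only; the function name `hwe_exact_support` is hypothetical, and the p value is taken as the total probability of all configurations that are no more likely than the one considered.

```python
import math


def hwe_exact_support(n, n_minor):
    """Map each possible heterozygote count to (probability, exact p value)
    for n individuals carrying n_minor copies of the minor allele."""
    n_major = 2 * n - n_minor
    # The heterozygote count must have the same parity as the allele count.
    hets = list(range(n_minor % 2, min(n_minor, n_major) + 1, 2))
    logs = [h * math.log(2.0)
            - math.lgamma((n_minor - h) // 2 + 1)
            - math.lgamma(h + 1)
            - math.lgamma((n_major - h) // 2 + 1)
            for h in hets]
    top = max(logs)
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    probs = [w / total for w in weights]
    support = {}
    for h, p in zip(hets, probs):
        p_value = sum(q for q in probs if q <= p * (1 + 1e-9))
        support[h] = (p, min(1.0, p_value))
    return support


# 500 cases: MAF 0.05 corresponds to 50 minor alleles, MAF 0.5 to 500.
low_maf = hwe_exact_support(500, 50)
high_maf = hwe_exact_support(500, 500)
most_likely = max(low_maf, key=lambda h: low_maf[h][0])
```

With 500 individuals, 50 minor alleles allow only 26 heterozygote counts, whereas 500 minor alleles allow 251, so far fewer p values are attainable at the low MAF. The most probable configuration always receives a p value of 1, which is the source of the spike described above.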
As the sample size or MAF increases, the number of distinct genotype configurations grows and the distribution of p values increasingly resembles a uniform distribution.

Table 1. The p Value Correlation Coefficient between the HWE Exact Test and the Likelihood-Ratio Association Test

Genetic Model    Minor-Allele Frequency
                 0.05       0.1        0.2       0.3      0.4      0.5
Additive         −0.029     −0.0063    0.0014    0.014    0.012    −0.0004
Genotypic (a)    0.14       0.30       0.27      0.27     0.28     0.27
Dominant         −0.0037    0.024      0.041     0.063    0.10     0.12
Recessive        0.16       0.29       0.22      0.17     0.14     0.095

The results are based on 10,000 simulated data sets under the null hypothesis.
(a) The genotypic model means that the genotypes are coded as categorical variables and the major-allele homozygote is taken as a reference.

To further evaluate the influence of these two assumptions on the analytical distribution of the TS and TSM statistics that the authors derived in the paper, we compared the analytical distribution of TS and TSM with the empirical distribution under different genetic models and MAF combinations, using the same simulated data sets. Results are shown in Figure S1 (available online). Specifically, for the additive model, when the MAF equals 0.05 or 0.1, the analytical p value is larger than the empirical p value at the right-hand tail of the distribution, and the difference is more evident at a MAF of 0.05. However, the analytical and empirical distributions match well at MAFs ≥ 0.2. Notably, when a genotypic model is used for calculation of the association test p value, the corresponding analytical and empirical distributions of the TS and TSM statistics do not fit well, regardless of the MAF. This can be attributed mainly to the nonignorable correlation between the two tests at various MAFs (violation of the independence assumption of the two tests).
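An empirical p value of the kind compared here can be obtained by ranking an observed statistic against statistics computed on data simulated under the null hypothesis. The sketch below is a generic illustration, not the resampling procedure of Wang et al.; the function name is hypothetical, larger statistics are assumed to indicate stronger evidence, and the add-one correction is a common convention rather than something taken from this letter.

```python
def empirical_p_value(observed, null_statistics):
    """Upper-tail empirical p value of `observed` relative to statistics
    simulated under the null hypothesis (with an add-one correction)."""
    exceed = sum(1 for s in null_statistics if s >= observed)
    return (exceed + 1) / (len(null_statistics) + 1)


# Toy numbers for illustration: four simulated null statistics, two of
# which are at least as large as the observed value of 0.5.
p_emp = empirical_p_value(0.5, [0.1, 0.2, 0.6, 0.7])
```

Because the null statistics are computed from simulated data that share the MAF, sample sizes, and genetic model of the real analysis, the resulting p value does not rely on the independence or uniformity assumptions discussed above.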
A similar result exists for a dominant or recessive genetic model when either the correlation between the two individual tests is nonignorable or the MAF is low (MAF = 0.05 or 0.1). As an example, Figure 2 shows the empirical and the analytical distributions of the TS and TSM statistics at a MAF of 0.2 under the null hypothesis. We found that the analytical and the empirical distributions fit well for the additive model. However, the empirical and analytical distributions clearly do not match for a genotypic-effect model: the exact p value is smaller than the empirical p value at the tails of the distribution for both TS and TSM. Therefore, we suggest the use of empirical distributions rather than exact distributions for TS and TSM measures in the practice of genetic association studies. Second, we are concerned with the power of the TS and TSM statistics. To evaluate the empirical power, we performed another 10,000 simulations using the same parameters implemented in Wang et al.; i.e., simulation of data 1 in model 1 with β0=−2 and β1=0.3. We calculated the power of the two marginal tests. At the 0.05 significance level, the power of the HWE test is 0.053, which is only slightly larger than the type I error, and the power of additive effects in logistic regression is 0.59. Surprisingly, the empirical power of additive effects for the TS and TSM statistics, which combine the p values of the HWE exact test and the association test, is only 0.25 (0.27), which is lower than what the authors reported in their Table 4. We are unable to explain this discrepancy. In summary, the TS statistic is useful for combining the information from the HWE test and the case-control association test to improve the power of detecting SNP effects in genetic association studies. However, one needs to be cautious when using this statistic.
On the basis of simulation results, we found that the analytical distributions of the TS and TSM statistics are influenced both by the MAF and by genetic models used in association tests. We suggest using the empirical p value, rather than the exact p value, in real situations. A more generalized statistic that does not depend on HWE-test significance in cases should be developed for the incorporation of HWE information and improvement of the power of genetic association studies. This work was supported by the U.S. National Institutes of Health (career development award K01 DA024758 to B.Z.Y.). We are grateful for Joel Gelernter's valuable discussion and comments on this letter.
Document S1. One Figure