Tail Strength to Combine Two P Values: Their Correlation Cannot Be Ignored
Yong Zang, Wing K. Fung, Gang Zheng
DOI: https://doi.org/10.1016/j.ajhg.2009.01.014
2009-01-01
Abstract: To the Editor: The population-based case-control study is a useful approach to evaluating genetic association with many common and complex diseases. In general, one first uses the generalized linear model to fit the data and then uses an asymptotic test to detect the true association. In addition to this regression-based analysis, when Hardy-Weinberg equilibrium (HWE) holds in the population, testing HWE in cases has been used to indicate association. Because the regression-based analyses (including the trend test and the likelihood-ratio test) are generally more powerful than testing HWE in cases, they are often employed in case-control studies, and less attention is paid to testing HWE in cases. In the July 2008 issue of The American Journal of Human Genetics, Wang and Shete [1] (Wang J., Shete S. A test for genetic association that incorporates information about deviation from Hardy-Weinberg proportions in cases. Am. J. Hum. Genet. 2008; 83: 53-63) proposed a novel approach of using the tail strength to combine the p value of the likelihood-ratio test (LRT) for association and the p value of an exact test for the deviation from HWE in cases. Taylor and Tibshirani [2] (Taylor J., Tibshirani R. A tail strength measure for assessing the overall univariate significance in a dataset. Biostatistics. 2006; 7: 167-181) originally proposed the tail strength as a measure of the overall strength of association for a large number of hypotheses in microarray analyses and genome-wide association studies (GWAS). Compared to Fisher's combination of p values [3] (Elston R.C. On Fisher method of combining p-values. Biometrical J. 1991; 33: 339-345), which weights each p value equally, the tail strength weights each ordered p value by its expectation under the null hypothesis.
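To make the contrast between the two combination rules concrete, here is a minimal Python sketch. It uses the tail strength as defined by Taylor and Tibshirani, in which the k-th smallest of m p values is rescaled by (m + 1)/k, the reciprocal of its null expectation k/(m + 1) under independent uniform p values. The function names are illustrative and do not come from either paper.

```python
import math


def tail_strength(p_values):
    """Tail strength of m p values: mean of 1 - p_(k) * (m + 1) / k."""
    m = len(p_values)
    ordered = sorted(p_values)
    # Under the null, the k-th smallest of m independent uniform p values
    # has expectation k / (m + 1), so each term has mean zero.
    return sum(1.0 - p * (m + 1) / k for k, p in enumerate(ordered, start=1)) / m


def fisher_statistic(p_values):
    """Fisher's combination: -2 * sum(log p), with every p value treated alike."""
    return -2.0 * sum(math.log(p) for p in p_values)
```

For two p values the tail strength reduces to 1 - 1.5 p_(1) - 0.75 p_(2), so a pair sitting exactly at its null expectations (1/3, 2/3) gives a tail strength of zero, and large positive values indicate p values smaller than expected.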
The tail strength can be used for combining independent and dependent p values and is not restricted to any special genetic model underlying the data. Wang and Shete [1] combined the two p values by using the tail strength and extended the original tail strength by using the medians of the ordered p values as weights. They derived asymptotic null distributions for the tail strengths by applying the additive model and using the mean and median as weights, respectively. Their results showed significant improvement in power when the tail strengths were used. They also showed that the type I errors were under control, although we notice that almost all reported type I errors in their tables are less than the nominal levels.

Normally, when the tail strength is used as a test statistic, its asymptotic null distribution is approximated by Monte-Carlo simulation procedures. Simulation-based approaches to determining the tail probabilities or p values of complex statistics have limitations for applications in GWAS [4, 5] (4: Sladek R., Rocheleau G., Rung J., Dina C., Shen L., Serre D., Boutin P., Vincent D., Belisle A., Hadjadj S., et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007; 445: 881-885; 5: Conneely K.N., Boehnke M. So many correlated tests, so little time! Rapid adjustment of p values for multiple correlated tests. Am. J. Hum. Genet. 2007; 81: 1158-1168). In this situation, deriving their asymptotic distributions is important. Although Wang and Shete [1] derived the asymptotic null distributions and critical values for their tail-strength statistics, they assumed in their derivations that the two p values were independent, even though in their introduction they mentioned that they would use the tail strength to combine two dependent p values. When the two p values are correlated, their asymptotic null distributions may be inappropriate.

Using two test statistics different from those in Wang and Shete [1], Zheng and Ng [6] (Zheng G., Ng H.K.T. Genetic model selection in two-phase analysis for case-control association studies. Biostatistics. 2008; 9: 391-399) noticed that the correlation between the p values of the trend test and of the test of HWE between cases and controls (HWDTT [7]; Song K., Elston R.C. A powerful method of combining measures of association and Hardy-Weinberg disequilibrium for fine-mapping in case-control studies. Stat. Med. 2006; 25: 105-126) could also vary from the recessive (REC) model to the additive (ADD) model, the multiplicative (MUL) model, and the dominant (DOM) model. As we mentioned before, Wang and Shete [1] considered the tail strengths based only on the ADD model. However, the performance of testing HWE in cases would also vary across the genetic models.
For example, it is known that testing HWE cannot detect association under the MUL model even though testing HWE has been used for detecting association [8, 9, 10] (8: Nielsen D.M., Ehm M.G., Weir B.S. Detecting marker-disease association by testing for Hardy-Weinberg disequilibrium at a marker locus. Am. J. Hum. Genet. 1998; 63: 1531-1540; 9: Wittke-Thompson J.K., Pluzhnikov A., Cox N.J. Rational inferences about departure from Hardy-Weinberg equilibrium. Am. J. Hum. Genet. 2005; 76: 967-986; 10: Wang T., Zhu X., Elston R.C. Improving power in contrasting linkage-disequilibrium patterns between cases and controls. Am. J. Hum. Genet. 2007; 80: 911-920). Therefore, the performance of the tail strength of Wang and Shete [1] can be potentially affected by two factors that were either ignored or not examined in their article. One is the correlation between the two p values of the LRT and the test for HWE in cases, and the other is the unknown underlying genetic model. In this letter, using Monte-Carlo simulation procedures, we study the correlations between the p values of the LRT and the exact test for Hardy-Weinberg proportions in cases under the four genetic models. If the two p values are indeed correlated, we examine the performance of the tail-strength statistics of Wang and Shete [1] under the null and alternative hypotheses.
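The remark about the MUL model can be checked numerically. The sketch below assumes HWE in the population and computes the Hardy-Weinberg disequilibrium coefficient among cases for each model. The heterozygote relative risks used to encode the four models (1 for REC, (1 + gamma2)/2 for ADD, the square root of gamma2 for MUL, and gamma2 for DOM) are the conventional definitions and are stated here as an assumption, as are the function names and the parameter values.

```python
def case_genotype_freqs(maf, gamma1, gamma2):
    """Genotype frequencies (0, 1, 2 risk alleles) among cases, given HWE in the population."""
    q = 1.0 - maf
    population = (q * q, 2.0 * maf * q, maf * maf)
    relative_risk = (1.0, gamma1, gamma2)  # baseline penetrance cancels out
    weighted = [g * r for g, r in zip(population, relative_risk)]
    total = sum(weighted)
    return [w / total for w in weighted]


def hwd_coefficient(freqs):
    """HWD coefficient: Pr(two risk alleles) minus the squared risk-allele frequency."""
    allele_freq = freqs[2] + 0.5 * freqs[1]
    return freqs[2] - allele_freq ** 2


def heterozygote_risk(model, gamma2):
    """Hypothetical helper: heterozygote relative risk assumed for each genetic model."""
    return {
        "REC": 1.0,
        "ADD": (1.0 + gamma2) / 2.0,
        "MUL": gamma2 ** 0.5,
        "DOM": gamma2,
    }[model]


for model in ("REC", "ADD", "MUL", "DOM"):
    gamma1 = heterozygote_risk(model, gamma2=2.0)
    d = hwd_coefficient(case_genotype_freqs(0.3, gamma1, 2.0))
    print(f"{model}: HWD coefficient in cases = {d:+.4f}")
```

Under the MUL encoding the case genotype frequencies are proportional to a perfect square, so the coefficient is zero up to rounding and a test of HWE in cases carries no information about association, whereas the REC and DOM encodings give clearly nonzero values.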
The analytical formula of the correlation, if any, between the LRT and the exact test for HWE used in Wang and Shete [1] is difficult to obtain. Therefore, we consider the combination of the p values of the trend test and the chi-square test for HWE between cases and controls (HWDTT), for which the asymptotic correlation between the two p values has been obtained [6, 7]. This new tail strength with the correlation is denoted by TSC. We further derive its asymptotic null distribution and critical value (see Appendix A). The comparison between our TSC and the tail strengths of Wang and Shete [1] is obtained by Monte-Carlo simulations under the null and alternative hypotheses. We also denote the tail strengths based on the mean and median in Wang and Shete [1] by TS and TSM, respectively.

Here we report the main results from our simulation study. In the simulation, we assumed HWE holds in the population.
In each replicate, 500 cases and 500 controls were generated under the null hypothesis with the baseline penetrance (the probability of disease with a genotype of zero risk alleles) fixed at 0.02, and the minor-allele frequency (MAF) was increased from 0.1 to 0.5 in increments of 0.1. We used a total of 10,000 replicates to estimate the null correlations between the two p values, the type I error rates, and power. The nominal levels 0.01 and 0.05 were used. For LRT statistics, we considered 1-degree-of-freedom tests. Therefore, for each genetic model under the alternative hypothesis (REC, ADD/MUL, and DOM), an optimal test is available for the LRT or trend test. In the simulation, we consider three LRTs and three trend tests, optimal for the three genetic models. Therefore, a total of nine tail strengths were considered in the simulation: TS, TSM, and TSC each have three model choices depending on the targeted genetic model. The results of the null correlations between the two p values in Wang and Shete [1] and the corresponding type I errors are reported in Table 1 for the nominal level 0.01 and in Table 2 for the nominal level 0.05.

Table 1. Simulated Null Correlations of the Two p Values of Wang and Shete [1] and the Asymptotic Type I Errors with Nominal Level 0.01

MAF   Model   Simulated Null Correlation   TS       TSM      TSC
0.1   REC      0.2702                      0.0262   0.0272   0.0067
0.1   ADD     −0.0049                      0.0056   0.0057   0.0081
0.1   DOM      0.0238                      0.0072   0.007    0.009
0.2   REC      0.2327                      0.0248   0.0251   0.0123
0.2   ADD      0.0018                      0.0092   0.0092   0.0108
0.2   DOM      0.0328                      0.0129   0.0128   0.0116
0.3   REC      0.1672                      0.0268   0.026    0.0131
0.3   ADD      0.0187                      0.0104   0.0101   0.0112
0.3   DOM      0.0505                      0.017    0.017    0.0092
0.4   REC      0.1454                      0.0225   0.0225   0.0118
0.4   ADD     −0.0149                      0.0076   0.0074   0.0083
0.4   DOM      0.0716                      0.0157   0.0153   0.0092
0.5   REC      0.1037                      0.0197   0.0201   0.0128
0.5   ADD     −0.0047                      0.0074   0.0077   0.0103
0.5   DOM      0.0919                      0.0174   0.0175   0.0081

TS uses means as weights, TSM uses medians as weights, and TSC is the proposed test with the correlations. Three genetic models, which are only used for constructing the optimal LRTs (for TS and TSM) and optimal trend tests (for TSC), are considered.

Table 2. Simulated Null Correlations of the Two p Values of Wang and Shete [1] and the Asymptotic Type I Errors with Nominal Level 0.05

MAF   Model   Simulated Null Correlation   TS       TSM      TSC
0.1   REC      0.2702                      0.0841   0.0832   0.0436
0.1   ADD     −0.0049                      0.0394   0.0403   0.0477
0.1   DOM      0.0238                      0.0409   0.0408   0.0462
0.2   REC      0.2327                      0.0765   0.0772   0.0476
0.2   ADD      0.0018                      0.0443   0.0441   0.0557
0.2   DOM      0.0328                      0.0513   0.0524   0.0472
0.3   REC      0.1672                      0.0727   0.072    0.0482
0.3   ADD      0.0187                      0.0438   0.0438   0.0468
0.3   DOM      0.0505                      0.0519   0.0524   0.0531
0.4   REC      0.1454                      0.0695   0.0678   0.0458
0.4   ADD     −0.0149                      0.0519   0.0516   0.0521
0.4   DOM      0.0716                      0.0574   0.0569   0.0494
0.5   REC      0.1037                      0.0722   0.0704   0.0481
0.5   ADD     −0.0047                      0.0474   0.0483   0.0509
0.5   DOM      0.0919                      0.0647   0.0633   0.0475

TS uses means as weights, TSM uses medians as weights, and TSC is the proposed test with the correlations. Three genetic models, which are only used for constructing the optimal LRTs (for TS and TSM) and optimal trend tests (for TSC), are considered.

The results in Tables 1 and 2 follow similar patterns. Thus, we focus on Table 1. The simulated null correlations between the p value of the LRT and the p value of the exact HWE test in cases indicate that the null correlations are not zero when the LRT is optimal for the REC or DOM models, but they are close to zero for the ADD (MUL) model. Hence, the type I errors of the TS and TSM of Wang and Shete [1] are under control when the LRT is optimal for the ADD (MUL) model but are largely inflated when the LRTs are optimal for the REC and DOM models, especially for the REC model. Note that Wang and Shete [1] only considered the LRT optimal for the ADD model. Therefore, their type I errors were under control. On the other hand, the type I errors of TSC, which takes care of the correlations, are close to the nominal level regardless of the targeted genetic models.

We also conducted simulations to compare the powers of the TS, TSM, and TSC.
For the TS and TSM, the correlations between the two p values were not incorporated. Thus, on the basis of the results in Tables 1 and 2, their powers could be inflated under the REC and DOM models, but not under the ADD and MUL models. The powers are presented for the TS, TSM, and TSC (from left to right) under the REC model (Figure 1) and the ADD model (Figure 2). The plots for the MUL and DOM models can be found in the Supplemental Data available online (Figures S1 and S2, respectively). The parameter values of the simulations under the alternative hypotheses are similar to those in Tables 1 and 2, except that the genotype relative risk (gamma2, which is defined as the ratio of the penetrance with two risk alleles to that with zero risk alleles) ranges from 1 to 2, and the MAF is fixed at 0.3. The “asymptotic” and “simulated” powers in the figures were based on the critical values obtained from 10,000 parametric bootstrap samples and 10,000 permutations, respectively.

Figure 2. The Asymptotic and Simulated Powers under the ADD Model. The tests from left to right are TS, TSM, and TSC. Gamma2 is the ratio of the penetrance with two risk alleles to that with zero risk alleles.

Figure 1 (under the REC model) shows that TS and TSM have similar powers and are more powerful than TSC. This could be due to the fact that TS and TSM had inflated type I errors, as shown in Tables 1 and 2. On the other hand, Figure 2 shows that the powers of TS, TSM, and TSC are similar under the ADD model because the three statistics had similar type I errors. For the TS and TSM, the bootstrap and permutation procedures yield similar powers under the ADD, MUL, and DOM models but have slightly different powers under the REC model.
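The way ignored correlation inflates the null rejection rate can be illustrated with a small stand-alone Monte-Carlo sketch. It is not the simulation reported in this letter: it draws two correlated standard normal statistics directly rather than simulating case-control genotypes, and the correlation values passed to it are hypothetical. The critical value relies on a simple fact for two independent uniform p values, namely that Pr(2 p_(1) + p_(2) < c) = c^2/3 for c ≤ 1, so the two-p-value tail strength 1 - 0.75 (2 p_(1) + p_(2)) exceeds 1 - 0.75 sqrt(3 alpha) with probability alpha under independence.

```python
import math
import random


def two_sided_p(z):
    """Two-sided normal p value, 2 * Phi(-|z|), via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def tail_strength_pair(p1, p2):
    """Tail strength of two p values with mean-based weights."""
    lo, hi = sorted((p1, p2))
    return 0.5 * ((1.0 - 3.0 * lo) + (1.0 - 1.5 * hi))


def rejection_rate(rho, alpha=0.05, replicates=20000, seed=1):
    """Null rejection rate when the critical value assumes independent p values.

    rho is the correlation between the two underlying z statistics
    (a hypothetical input, not a value taken from the tables).
    """
    rng = random.Random(seed)
    critical = 1.0 - 0.75 * math.sqrt(3.0 * alpha)  # exact under independence
    rejections = 0
    for _ in range(replicates):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal and has correlation rho with z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        if tail_strength_pair(two_sided_p(z1), two_sided_p(z2)) > critical:
            rejections += 1
    return rejections / replicates
```

Calling `rejection_rate(0.0)` should return a value close to the nominal 0.05, while a strongly correlated pair such as `rejection_rate(0.9)` rejects noticeably more often, which mirrors the direction of the inflation seen for TS and TSM under the REC model.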
We also studied empirical powers of the TSC, the optimal trend test, a robust test MAX3 [11] (Freidlin B., Zheng G., Li Z., Gastwirth J.L. Trend tests for case-control studies of genetic markers: power, sample size and robustness. Hum. Hered. 2002; 53: 146-152), and the classical Pearson's test for association under the four genetic models. The description and summary of our findings are given in Appendix B. The results show that the TSC has moderate power improvement under the REC model but loses significant power under the ADD and MUL models. This can be explained by the fact that testing HWE has little power under the ADD and MUL models.

In summary, the tail strength may improve power under some specific genetic models after correction for the correlation. However, when the underlying genetic model is unknown, the robust statistics are preferable [6, 11].

We would like to thank Yaning Yang for some helpful discussions that brought our attention to the tail strength. The work of Y. Zang and W.K. Fung was partially supported by The Croucher Foundation and China Natural Science Foundation (no. 10701067).

Denote the HWDTT by $Z_*$, which is a statistic testing HWE between cases and controls and was proposed by Song and Elston [7]. Denote the trend test by $Z_x$, where $x = 0$, $0.5$, and $1$ for the REC, ADD (MUL), and DOM models, respectively [11, 12, 13] (12: Sasieni P.D. From genotype to genes: doubling the sample size. Biometrics. 1997; 53: 1253-1261; 13: Zheng G., Freidlin B., Li Z., Gastwirth J.L. Choice of scores in trend tests for case-control studies of candidate-gene associations. Biometrical J. 2003; 45: 335-348). Under the null hypothesis $H_0$, $(Z_*, Z_x)$ follows the bivariate normal distribution $N(0, \Sigma_1)$ with the density function $f_1$, where

$$\Sigma_1 = \begin{pmatrix} 1 & \rho_x \\ \rho_x & 1 \end{pmatrix},$$

and $(-Z_*, Z_x)$ follows the bivariate normal distribution $N(0, \Sigma_2)$ with the density function $f_2$, where

$$\Sigma_2 = \begin{pmatrix} 1 & -\rho_x \\ -\rho_x & 1 \end{pmatrix}.$$

The expressions for $\rho_x$ were given in Zheng and Ng for different $x$ values [6]. The following derivation can be modified to the tail strength of any two correlated p values. The p value of $Z_*$ is $P_* = 2\Phi(-|z_*|)$, and the p value of $Z_x$ is $P_x = 2\Phi(-|z_x|)$, where $\Phi$ is the cumulative distribution function of the standard normal $N(0, 1)$, and $z_*$ and $z_x$ are the observed statistics. Then the joint distribution of $P_*$ and $P_x$ is:

$$\begin{aligned}
F(x_1, x_2) &= \Pr(P_* < x_1,\ P_x < x_2) = \Pr\left(|Z_*| > \Phi^{-1}\left(1 - \tfrac{x_1}{2}\right),\ |Z_x| > \Phi^{-1}\left(1 - \tfrac{x_2}{2}\right)\right) \\
&= \Pr\left(Z_* < \Phi^{-1}\left(\tfrac{x_1}{2}\right),\ Z_x < \Phi^{-1}\left(\tfrac{x_2}{2}\right)\right) + \Pr\left(Z_* < \Phi^{-1}\left(\tfrac{x_1}{2}\right),\ -Z_x < \Phi^{-1}\left(\tfrac{x_2}{2}\right)\right) \\
&\quad + \Pr\left(-Z_* < \Phi^{-1}\left(\tfrac{x_1}{2}\right),\ Z_x < \Phi^{-1}\left(\tfrac{x_2}{2}\right)\right) + \Pr\left(-Z_* < \Phi^{-1}\left(\tfrac{x_1}{2}\right),\ -Z_x < \Phi^{-1}\left(\tfrac{x_2}{2}\right)\right) \\
&= 2\int_{-\infty}^{\Phi^{-1}(x_1/2)} \int_{-\infty}^{\Phi^{-1}(x_2/2)} f_1(y_1, y_2)\, dy_1\, dy_2 + 2\int_{-\infty}^{\Phi^{-1}(x_1/2)} \int_{-\infty}^{\Phi^{-1}(x_2/2)} f_2(y_1, y_2)\, dy_1\, dy_2 .
\end{aligned}$$

Thus, its density function can be written as

$$f(x_1, x_2) = \frac{\partial^2 F(x_1, x_2)}{\partial x_1\, \partial x_2} = \sum_{i=0}^{1} \exp\left[-\frac{\left\{\Phi^{-1}\left(\tfrac{x_1}{2}\right)\right\}^2 + \left\{\Phi^{-1}\left(\tfrac{x_2}{2}\right)\right\}^2 + (-1)^i\, 2\rho_x \left\{\Phi^{-1}\left(\tfrac{x_1}{2}\right)\right\}\left\{\Phi^{-1}\left(\tfrac{x_2}{2}\right)\right\}}{2(1 - \rho_x^2)}\right] \times \left\{2\sqrt{1 - \rho_x^2}\, \exp\left[-\frac{\left\{\Phi^{-1}\left(\tfrac{x_1}{2}\right)\right\}^2 + \left\{\Phi^{-1}\left(\tfrac{x_2}{2}\right)\right\}^2}{2}\right]\right\}^{-1}.$$

Therefore, the ordered p values have the cumulative function given by F(x(1),x(2))=Pr(P(1)