mBAT-combo: a more powerful test to detect gene-trait associations from GWAS data
Ang Li, Shouye Liu, Andrew Bakshi, Longda Jiang, Wenhan Chen, Zhili Zheng, Patrick F. Sullivan, Peter M. Visscher, Naomi R. Wray, Jian Yang, Jian Zeng
DOI: https://doi.org/10.1101/2022.06.27.497850
2022-06-30
bioRxiv
Abstract: Gene-based association tests aggregate multiple SNP-trait associations into sets defined by gene boundaries. Because genes have a direct biological link to downstream function, gene-based test results are widely used in post-GWAS analysis. A common approach for gene-based tests is to combine SNP associations by computing the sum of chi-squared statistics. However, this strategy ignores the directions of SNP effects, which can result in a loss of power for SNPs with masking effects (e.g., when the product of the effects of two SNPs and their linkage disequilibrium (LD) correlation is negative). Here, we introduce mBAT-combo, a new set-based test that is better powered than other methods to detect multi-SNP associations in the context of masking effects. We validate the method through simulations and applications to real data. We find that of 35 blood and urine biomarker traits in the UK Biobank, 34 show evidence for masking effects in a total of 4,175 gene-trait pairs, indicating that masking effects are common in complex traits. We further validate the improved power of our method in height, body mass index and schizophrenia with different GWAS sample sizes and show that, on average, 95.7% of the genes detected only by mBAT-combo with smaller sample sizes can be identified by the single-SNP approach with larger sample sizes (average sample size increased by 1.7-fold). For instance, LRRC4B, which has been shown to play a role in presynaptic pathology using genetic fine-mapping and evidence-based synaptic annotations, is significant for schizophrenia only with our method. As a more powerful gene-based method, mBAT-combo is expected to improve downstream pathway analysis or tissue and cell-type enrichment analysis, which take genes identified from GWAS data as input to understand the biological mechanisms of a trait or disease.
Despite our focus on genes in this study, the framework of mBAT-combo is general and can be applied to any set of SNPs to refine trait-association signals hidden in genomic regions with complex LD structures.
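
The masking scenario described in the abstract can be illustrated with a small two-SNP calculation. The sketch below is not the mBAT-combo implementation. It assumes standardized genotypes and phenotype, small effects, and noise-free expected z-scores, and all function names and numeric values are hypothetical choices made for illustration. It contrasts the sum of chi-squared statistics with a quadratic form that accounts for LD, z' R^{-1} z, where R is the 2x2 LD correlation matrix.

```python
import math


def marginal_effects(b1, b2, r):
    """Expected single-SNP (marginal) effects for two standardized SNPs
    with joint effects b1, b2 and LD correlation r."""
    return b1 + r * b2, b2 + r * b1


def sum_chisq(z1, z2):
    """Sum of single-SNP chi-squared statistics; ignores effect directions."""
    return z1 ** 2 + z2 ** 2


def ld_adjusted_chisq(z1, z2, r):
    """Quadratic form z' R^{-1} z for R = [[1, r], [r, 1]], written out
    explicitly for the two-SNP case."""
    return (z1 ** 2 - 2.0 * r * z1 * z2 + z2 ** 2) / (1.0 - r ** 2)


def expected_stats(b1, b2, r, n):
    """Noise-free expected statistics for sample size n.

    Assumption (illustrative only): with standardized variables and small
    effects, the expected z-score is roughly sqrt(n) * marginal effect.
    """
    m1, m2 = marginal_effects(b1, b2, r)
    z1, z2 = math.sqrt(n) * m1, math.sqrt(n) * m2
    return sum_chisq(z1, z2), ld_adjusted_chisq(z1, z2, r)


if __name__ == "__main__":
    n = 10_000  # hypothetical GWAS sample size
    # Masking: b1 * b2 * r < 0, so the marginal effects nearly cancel.
    print("masking:   ", expected_stats(0.05, -0.05, 0.8, n))
    # No masking: both joint effects act in the same direction.
    print("no masking:", expected_stats(0.05, 0.05, 0.8, n))
```

With these illustrative values, the marginal z-scores under masking are about 1 and -1, so the sum of chi-squared statistics is about 2 while the LD-adjusted statistic is about 10. When both joint effects are positive, the ordering reverses (about 162 versus 90). The two statistics have different null distributions, so the raw values convey only the direction of the effect and are not a power comparison.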