Computationally efficient meta-analysis of gene-based tests using summary statistics in large-scale genetic studies
Tyler Joseph, Joelle Mbatchou, Arkopravo Ghosh, Anthony Marcketta, Christopher Gillies, Jing Tang, Priyanka Nakka, Xinyuan Zhang, Jack Kosmicki, Carlo Sidore, Lauren Gurski, Regeneron Genetics Center, Maya Ghoussaini, Manuel A.R. Ferreira, Goncalo Abecasis, Jonathan Marchini
DOI: https://doi.org/10.1101/2024.12.06.24318617
2024-12-08
Abstract: Meta-analysis of gene-based tests using single-variant summary statistics is a powerful strategy for associating genes with disease. However, current approaches require sharing the covariance matrix between variants for each study and trait of interest. For large-scale studies with many phenotypes, these matrices can be cumbersome to calculate, store, and share. To address this challenge, we present REMETA, an efficient tool for meta-analysis of gene-based tests. REMETA uses a single sparse covariance reference file per study that is rescaled for each phenotype using single-variant summary statistics. We develop methods to apply REMETA to binary traits with case-control imbalance, and to estimate allele frequencies, genotype counts, and effect sizes of burden tests. We demonstrate the performance and advantages of our approach via meta-analysis of 5 traits in 469,376 samples in UK Biobank. The open-source REMETA software tools and framework will facilitate meta-analysis across diverse large-scale exome sequencing studies that cannot easily be brought together.
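
To make the rescaling idea concrete, the following is a minimal sketch of how a phenotype-independent reference matrix might be combined with single-variant summary statistics to form a burden test. It is not the REMETA implementation: the function name `burden_from_summary`, its arguments, the use of a dense correlation matrix, and the standard score-test identities (score variance taken as 1/SE², score as beta/SE²) are all assumptions made for illustration.

```python
import numpy as np

def burden_from_summary(beta, se, ld_corr, weights=None):
    """Hypothetical burden test from single-variant summary statistics.

    beta, se : per-variant effect estimates and standard errors
    ld_corr  : reference correlation matrix between variants (assumed to be
               computed once per study, independent of phenotype)
    weights  : per-variant burden weights (defaults to all ones)
    """
    beta = np.asarray(beta, dtype=float)
    se = np.asarray(se, dtype=float)
    ld_corr = np.asarray(ld_corr, dtype=float)
    if weights is None:
        weights = np.ones_like(beta)
    weights = np.asarray(weights, dtype=float)

    # Assumed score-test identities: Var(U_j) = 1 / SE_j^2, U_j = beta_j / SE_j^2.
    score_var = 1.0 / se**2
    score = beta * score_var

    # Rescale the reference correlation to a phenotype-specific covariance.
    scale = np.sqrt(score_var)
    cov = ld_corr * np.outer(scale, scale)

    # Collapse variants into a single burden score and its variance.
    u_burden = weights @ score
    var_burden = weights @ cov @ weights

    chi2 = u_burden**2 / var_burden      # 1-df chi-square statistic
    beta_burden = u_burden / var_burden  # burden effect size estimate
    se_burden = 1.0 / np.sqrt(var_burden)
    return chi2, beta_burden, se_burden
```

Under these assumptions, a gene containing a single variant reduces to the usual single-variant statistic (beta/SE)², which is a quick sanity check on the rescaling step.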