Integration of variant annotations using deep set networks boosts rare variant association testing

Brian Clarke, Eva Holtkamp, Hakime Öztürk, Marcel Mück, Magnus Wahlberg, Kayla Meyer, Felix Munzlinger, Felix Brechtmann, Florian R. Hölzlwimmer, Jonas Lindner, Zhifen Chen, Julien Gagneur, Oliver Stegle
DOI: https://doi.org/10.1038/s41588-024-01919-z
IF: 30.8
2024-09-25
Nature Genetics
Abstract: Rare genetic variants can have strong effects on phenotypes, yet accounting for rare variants in genetic analyses is statistically challenging due to the limited number of allele carriers and the burden of multiple testing. While rich variant annotations promise to enable well-powered rare variant association tests, methods integrating variant annotations in a data-driven manner are lacking. Here we propose deep rare variant association testing (DeepRVAT), a model based on set neural networks that learns a trait-agnostic gene impairment score from rare variant annotations and phenotypes, enabling both gene discovery and trait prediction. On 34 quantitative and 63 binary traits, using whole-exome-sequencing data from UK Biobank, we find that DeepRVAT yields substantial gains in gene discoveries and improved detection of individuals at high genetic risk. Finally, we demonstrate how DeepRVAT enables calibrated and computationally efficient rare variant tests at biobank scale, aiding the discovery of genetic risk factors for human disease traits.
genetics & heredity