A robust and adaptive framework for interaction testing in quantitative traits between multiple genetic loci and exposure variables
Julian Hecker,Dmitry Prokopenko,Matthew Moll,Sanghun Lee,Wonji Kim,Dandi Qiao,Kirsten Voorhies,Woori Kim,Stijn Vansteelandt,Brian D Hobbs,Michael H Cho,Edwin K Silverman,Sharon M Lutz,Dawn L DeMeo,Scott T Weiss,Christoph Lange
DOI: https://doi.org/10.1371/journal.pgen.1010464
2022-11-16
Abstract:The identification and understanding of gene-environment interactions can provide insights into the pathways and mechanisms underlying complex diseases. However, testing for gene-environment interaction remains a challenge since a.) statistical power is often limited and b.) modeling of environmental effects is nontrivial and such model misspecifications can lead to false positive interaction findings. To address the lack of statistical power, recent methods aim to identify interactions on an aggregated level using, for example, polygenic risk scores. While this strategy can increase the power to detect interactions, identifying contributing genes and pathways is difficult based on these relatively global results. Here, we propose RITSS (Robust Interaction Testing using Sample Splitting), a gene-environment interaction testing framework for quantitative traits that is based on sample splitting and robust test statistics. RITSS can incorporate sets of genetic variants and/or multiple environmental factors. Based on the user's choice of statistical/machine learning approaches, a screening step selects and combines potential interactions into scores with improved interpretability. In the testing step, the application of robust statistics minimizes the susceptibility to main effect misspecifications. Using extensive simulation studies, we demonstrate that RITSS controls the type 1 error rate in a wide range of scenarios, and we show how the screening strategy influences statistical power. In an application to lung function phenotypes and human height in the UK Biobank, RITSS identified highly significant interactions based on subcomponents of genetic risk scores. While the contributing single variant interaction signals are weak, our results indicate interaction patterns that result in strong aggregated effects, providing potential insights into underlying gene-environment interaction mechanisms.