Impact of nonrandom selection mechanisms on the causal effect estimation for two-sample Mendelian randomization methods
Yuanyuan Yu,Lei Hou,Xu Shi,Xiaoru Sun,Xinhui Liu,Yifan Yu,Zhongshang Yuan,Hongkai Li,Fuzhong Xue
DOI: https://doi.org/10.1371/journal.pgen.1010107
IF: 4.5
2022-01-01
PLoS Genetics
Author summary: It is well known that nonrandom selection in one-sample Mendelian Randomization (MR) can result in biased estimates and inflated type I error rates. Two-sample MR analyses are more prone to nonrandom selection than one-sample MR analyses, because the two samples used for genome-wide association studies (GWAS) may each be selected from the source population under a different mechanism. Summary-level genetic association statistics in two-sample MR may be derived from different study designs, such as case-control, case-only and cohort studies, which further affects the causal effect estimation of exposure on outcome. In this study, we first propose a theorem for causal effect invariance under different selection mechanisms. In the simulation, we design 49 combinations of nonrandom selection mechanisms in sample I and sample II, which are widespread in practical applications. The simulation results reveal that the selection mechanisms in sample II have a larger influence on biases and type I error rates than those in sample I. As an illustrative example, we find that nonrandom selection in sample II (coronary heart disease patients) can magnify the causal effect estimation of obesity on HbA1c levels.

Abstract: Nonrandom selection in one-sample Mendelian Randomization (MR) results in biased estimates and inflated type I error rates only when the selection effects are sufficiently large. In two-sample MR, the different selection mechanisms in the two samples may affect the causal effect estimation more seriously. First, we propose sufficient conditions for causal effect invariance under different selection mechanisms using two-sample MR methods. In the simulation study, we consider 49 possible selection mechanisms in two-sample MR, which depend on genetic variants (G), exposures (X), outcomes (Y) and their combinations. We further compare eight pleiotropy-robust methods under different selection mechanisms.
Simulation results reveal that nonrandom selection in sample II has a larger influence on biases and type I error rates than nonrandom selection in sample I. Furthermore, selections depending on X+Y, G+Y, or G+X+Y in sample II lead to larger biases than other selection mechanisms. Notably, when selection depends on Y, the bias of the causal estimate is larger for a non-zero causal effect than for a null causal effect. Among the eight methods, the mode-based estimate has the largest standard errors. In the absence of pleiotropy, selections depending on Y or G in sample II yield nearly unbiased causal effect estimates when the causal effect is null. In the scenarios of balanced pleiotropy, all eight MR methods, especially MR-Egger, show large biases because the nonrandom selections violate the Instrument Strength Independent of Direct Effect (InSIDE) assumption. When directional pleiotropy exists, nonrandom selections have a severe impact on the eight MR methods. An application demonstrates that nonrandom selection in sample II (coronary heart disease patients) can magnify the causal effect estimation of obesity on HbA1c levels. In conclusion, nonrandom selection in two-sample MR exacerbates the bias of causal effect estimation for pleiotropy-robust MR methods.
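
The contrast between selection on Y alone and selection on X+Y in sample II under a null causal effect can be illustrated with a toy simulation. The sketch below is not the authors' simulation design. It assumes a single variant, a linear model with one unmeasured confounder U, hard-threshold selection, and a Wald ratio estimator; all function names and parameter values are illustrative.

```python
import random


def simulate(n, beta, rng, alpha=0.5, maf=0.3):
    """Draw n individuals as (G, X, Y) tuples from an assumed one-variant model."""
    rows = []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)  # allele count 0, 1 or 2
        u = rng.gauss(0, 1)                              # unmeasured confounder
        x = alpha * g + u + rng.gauss(0, 1)              # exposure
        y = beta * x + u + rng.gauss(0, 1)               # outcome, causal effect beta
        rows.append((g, x, y))
    return rows


def slope(pairs):
    """Ordinary least squares slope of the second element on the first."""
    pairs = list(pairs)
    n = len(pairs)
    mean_g = sum(g for g, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    s_gv = sum((g - mean_g) * (v - mean_v) for g, v in pairs)
    s_gg = sum((g - mean_g) ** 2 for g, _ in pairs)
    return s_gv / s_gg


def wald_ratio(sample1, sample2):
    """Two-sample Wald ratio: G-Y slope (sample II) over G-X slope (sample I)."""
    b_gx = slope((g, x) for g, x, _ in sample1)
    b_gy = slope((g, y) for g, _, y in sample2)
    return b_gy / b_gx


rng = random.Random(2022)
sample1 = simulate(50_000, beta=0.0, rng=rng)   # sample I: randomly selected
source2 = simulate(50_000, beta=0.0, rng=rng)   # source population for sample II

# Three assumed selection mechanisms for sample II (true causal effect is 0).
estimates = {
    "random": wald_ratio(sample1, source2),
    "Y": wald_ratio(sample1, [r for r in source2 if r[2] > 0]),
    "X+Y": wald_ratio(sample1, [r for r in source2 if r[1] + r[2] > 0]),
}

for name, est in estimates.items():
    print(f"selection in sample II on {name}: estimate = {est:.3f}")
```

Under these assumptions, the estimates with random selection and with selection on Y stay close to zero, whereas selection on X+Y shifts the estimate clearly away from zero (negative in this setup), because conditioning on X+Y induces an association between G and Y even though X has no effect on Y.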