Abstract:
Background: Two-sample summary-data Mendelian randomization (MR) incorporating multiple genetic variants within a meta-analysis framework is a popular technique for assessing causality in epidemiology. If all genetic variants satisfy the instrumental variable (IV) and necessary modelling assumptions, then their individual ratio estimates of causal effect should be homogeneous. Observed heterogeneity signals that one or more of these assumptions could have been violated.
Methods: Causal estimation and heterogeneity assessment in MR require an approximation for the variance, or equivalently the inverse-variance weight, of each ratio estimate. We show that the most popular 'first-order' weights can lead to an inflation in the chances of detecting heterogeneity when in fact it is not present. Conversely, ostensibly more accurate 'second-order' weights can dramatically increase the chances of failing to detect heterogeneity when it is truly present. We derive modified weights to mitigate both of these adverse effects.
Results: Using Monte Carlo simulations, we show that the modified weights outperform first- and second-order weights in terms of heterogeneity quantification. Modified weights are also shown to remove the phenomenon of regression dilution bias in MR estimates obtained from weak instruments, unlike those obtained using first- and second-order weights. However, with small numbers of weak instruments, this comes at the cost of a reduction in estimate precision and power to detect a causal effect compared with first-order weighting. Moreover, first-order weights always furnish unbiased estimates and preserve the type I error rate under the causal null. We illustrate the utility of the new method using data from a recent two-sample summary-data MR analysis to assess the causal role of systolic blood pressure on coronary heart disease risk.
Conclusions: We propose the use of modified weights within two-sample summary-data MR studies for accurately quantifying heterogeneity and detecting outliers in the presence of weak instruments. Modified weights also have an important role to play in terms of causal estimation (in tandem with first-order weights) but further research is required to understand their strengths and weaknesses in specific settings.
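
As an illustration of the weighting issue described in the Methods, the following minimal sketch computes per-variant ratio estimates, the standard first-order and second-order (delta-method) inverse-variance weights, and Cochran's Q statistic under each. It is a sketch under stated assumptions, not the authors' implementation: the function names and the summary statistics are hypothetical, and the modified weights derived in the paper are not reproduced here because the abstract does not give their form.

```python
# Hypothetical helper names; illustrative only, not the paper's code.

def ratio_estimates(beta_x, beta_y):
    """Per-variant ratio (Wald) estimates: SNP-outcome / SNP-exposure."""
    return [by / bx for bx, by in zip(beta_x, beta_y)]

def first_order_weights(beta_x, se_y):
    """First-order weights: ignore uncertainty in the SNP-exposure estimate."""
    return [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]

def second_order_weights(beta_x, beta_y, se_x, se_y):
    """Second-order weights: delta-method variance including the SNP-exposure term."""
    return [
        1.0 / (sy ** 2 / bx ** 2 + (by ** 2) * (sx ** 2) / bx ** 4)
        for bx, by, sx, sy in zip(beta_x, beta_y, se_x, se_y)
    ]

def ivw_and_q(ratios, weights):
    """Inverse-variance weighted estimate and Cochran's Q for the given weights."""
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, q

# Made-up summary statistics for four variants (assumed values, not real data).
beta_x = [0.10, 0.08, 0.12, 0.05]      # SNP-exposure associations
se_x = [0.02, 0.02, 0.03, 0.02]        # their standard errors
beta_y = [0.050, 0.036, 0.066, 0.030]  # SNP-outcome associations
se_y = [0.010, 0.012, 0.015, 0.011]    # their standard errors

ratios = ratio_estimates(beta_x, beta_y)
est1, q1 = ivw_and_q(ratios, first_order_weights(beta_x, se_y))
est2, q2 = ivw_and_q(ratios, second_order_weights(beta_x, beta_y, se_x, se_y))
print(f"first-order:  estimate={est1:.4f}, Q={q1:.4f}")
print(f"second-order: estimate={est2:.4f}, Q={q2:.4f}")
```

Because each second-order weight is no larger than the corresponding first-order weight, Q computed with second-order weights can never exceed Q computed with first-order weights on the same data. This is consistent with the pattern described above, in which first-order weights tend to over-detect heterogeneity and second-order weights tend to under-detect it.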

Optimal Covariate Weighting Increases Discoveries in High-throughput Biology

A Weighted Prognostic Covariate Adjustment Method for Efficient and Powerful Treatment Effect Inferences in Randomized Controlled Trials

The covariate-adjusted residual estimator and its use in both randomized trials and observational settings

A Non-Parametric Method for Building Predictive Genetic Tests on High-Dimensional Data

Robust weights that optimally balance confounders for estimating marginal hazard ratios

Gene-based genetic association test with adaptive optimal weights.

Leveraging historical data to optimize the number of covariates and their explained variance in the analysis of randomized clinical trials.

Model-free selective inference under covariate shift via weighted conformal p-values

Balancing Weights for Causal Inference in Observational Factorial Studies

Optimal False Discovery Rate Control for Large Scale Multiple Testing with Auxiliary Information

A Powerful Approach to Test an Optimally Weighted Combination of Rare Variants in Admixed Populations

Asymptotic properties of covariate-adaptive randomization

Alternative weighting schemes for fine-tuned extended similarity index calculations

Optimal transport weights for causal inference

Multiple multi-sample testing under arbitrary covariance dependency

Unconfounded Meta-analytical Frameworks for Multivariate Outcomes in Multigroup Observational Studies using Concordant Weights

Efficient Algorithms for Covariate Analysis with Dynamic Data Using Nonlinear Mixed-Effects Model.

A powerful approach to test an optimally weighted combination of rare variants in admixed populations.

Improving the accuracy of two-sample summary-data Mendelian randomization: moving beyond the NOME assumption

A generalization of moderated statistics to data adaptive semiparametric estimation in high-dimensional biology

To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice