Commentary: "No Genetic Causality between Tobacco Smoking and Venous Thromboembolism: A Two-Sample Mendelian Randomization Study"

Jinhua Liu, Youqian Zhang, Bo Zeng
DOI: https://doi.org/10.1055/s-0044-1787653
2024-06-08
Thrombosis and Haemostasis
Abstract: Mendelian randomization (MR) is a powerful method for causal inference that uses genetic variants as instrumental variables (IVs) and has gained popularity with the increase in genome-wide association studies (GWAS) in recent years. Recently, Du et al conducted an MR analysis to explore the causal relationship between smoking and venous thromboembolism (VTE), including pulmonary embolism (PE) and deep vein thrombosis (DVT), in European populations.[1] Interestingly, the authors found no causal association between the two, which contrasts sharply with a previous MR study indicating that genetic susceptibility to smoking is associated with an increased incidence of DVT and PE.[2] Upon careful review, the inconsistencies in the results can be attributed to several limitations, including an incomplete consideration of smoking phenotypes, sample overlap, and the use of outdated databases. Consequently, the conclusions of this study should be interpreted with caution.
In the MR framework, obtaining a robust causal association requires strict adherence to three fundamental assumptions, as well as consideration of biases caused by weak IVs, sample overlap, and inadequate statistical power. A major issue in this study is the failure to account for sample overlap. Both smoking and DVT data were sourced from the UK Biobank (UKB), with sample overlap rates reaching as high as 92% and 99%, respectively. Such high overlap rates can increase the risk of type I errors and the winner's curse, adversely affecting the accuracy of causal inference.[3] Therefore, conducting additional sensitivity analyses or using nonoverlapping cohorts is necessary. Recently, a new method designed to address sample overlap, MRLap,[4] which uses cross-trait linkage disequilibrium score regression intercepts for correction, has shown good fit in simulations with 5 to 95% overlap.
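To make the IV terminology above concrete, the following is a minimal sketch of two standard summary-statistic quantities in two-sample MR: the per-variant F-statistic, approximated as (beta/SE)^2 and commonly used to flag weak IVs, and the fixed-effect inverse-variance weighted (IVW) estimate. The summary statistics, the function names, and the F < 10 rule of thumb as applied here are illustrative assumptions; none of the numbers come from Du et al or from this commentary, and the sketch does not implement MRLap or any overlap correction.

```python
import math

# Hypothetical per-SNP summary statistics (illustrative values, not real GWAS data).
# beta_x / se_x: SNP-exposure association; beta_y / se_y: SNP-outcome association.
snps = [
    {"beta_x": 0.10, "se_x": 0.010, "beta_y": 0.020, "se_y": 0.010},
    {"beta_x": 0.08, "se_x": 0.012, "beta_y": 0.012, "se_y": 0.011},
    {"beta_x": 0.05, "se_x": 0.020, "beta_y": 0.015, "se_y": 0.012},
]


def f_statistic(snp):
    """Approximate single-variant F-statistic for instrument strength."""
    return (snp["beta_x"] / snp["se_x"]) ** 2


def ivw_estimate(snps):
    """Fixed-effect IVW estimate and its standard error.

    Equivalent to a weighted average of the per-SNP Wald ratios
    (beta_y / beta_x) with weights beta_x^2 / se_y^2.
    """
    numerator = sum(s["beta_x"] * s["beta_y"] / s["se_y"] ** 2 for s in snps)
    denominator = sum(s["beta_x"] ** 2 / s["se_y"] ** 2 for s in snps)
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Flag variants below the conventional F < 10 rule of thumb (an assumption here).
weak = [i for i, s in enumerate(snps) if f_statistic(s) < 10]
beta_ivw, se_ivw = ivw_estimate(snps)
print("weak instrument indices:", weak)
print("IVW estimate:", round(beta_ivw, 4), "SE:", round(se_ivw, 4))
```

In practice, weak instruments matter more under sample overlap because the resulting bias is pulled toward the confounded observational estimate rather than toward the null, which is why the overlap rates quoted above are a concern.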
A second major issue in this study is the choice of data, including the use of outdated datasets and an incomplete consideration of the exposure phenotype. The study utilized the 2021 release (R5) of the FinnGen consortium for DVT and PE data, despite the availability of the more recent R9 version at the time. Employing the latest datasets could provide a greater number of genetic variants, enhance the strength of the IVs, and include more accurate environmental and demographic information, thereby improving the accuracy and reliability of causal inference. Notably, the impact of using outdated data and the resulting discrepancies has been validated in other MR analyses.[5][6] The exposure phenotypes in the study were sourced from the UKB and included only current and past smoking status, which falls well short of capturing the multidimensional characteristics of smoking behavior. A more comprehensive approach would start from the smoking phenotypes provided by the GWAS and Sequencing Consortium of Alcohol and Nicotine use (GSCAN), supplemented by those from the UKB; these include, but are not limited to, smoking intensity, frequency, duration, and cessation behaviors.[7] These aspects are crucial for assessing the impact of smoking on health ([Table 1]). Therefore, the authors' assertion that there is no causal link between smoking and VTE risk is premature and one-sided.
Table 1
| Phenotype | Definition/diagnostics | Ref. | Consortium | Ancestry | Participants |
|---|---|---|---|---|---|
| Pack-years | Total number of pack-years smoked in adulthood | 36402876 | UKB | EUR | 142,387 individuals |
| CigDay (past and current) | Cigarettes smoked per day by current smokers only or current and former smokers combined | 30643251 | GSCAN | EUR | 337,334 individuals |
| CigDay (current) | Number of cigarettes currently smoked daily | 36402876 | UKB | EUR | 33,229 individuals |
| Lifetime smoking | Measures the cumulative effect of smoking on health outcomes by encompassing the initiation, duration, intensity, and time since stopping smoking | 31689377 | UKB | EUR | 462,690 individuals |
| SmkInit | An individual has ever smoked regularly (yes/no) | 30643251 | GSCAN | EUR | 1,232,091 individuals |
Abbreviations: CigDay, cigarettes per day; EUR, European; GSCAN, GWAS and Sequencing Consortium of Alcohol and Nicotine use; SmkInit, smoking initiation; UKB, UK Biobank.
Statistical power is considered one of the main challenges in MR studies, as most genetic variants predict only a small fraction of the phenotypic variation.[8] A third major issue in this study is the neglect of power calculations. A recent study highlights the need for caution when excluding single-nucleotide polymorphisms (SNPs) to control for horizontal pleiotropy; if SNPs associated with confounding factors are crucial for the phenotype under investigation, their exclusion could inadvertently "reduce noise blindly," thereby weakening the detection capability and increasing the risk of type I errors.[9] While we appreciate the authors' efforts to address the independence assumption, the importance of statistical power must also be recognized. Moreover, traditional s -Abstract Truncated-
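As a rough illustration of the power concern raised above, the sketch below computes approximate power for an MR analysis with a binary outcome. It assumes the asymptotic variance of the causal estimate is about 1 / (N * R2 * p * (1 - p)), where N is the outcome sample size, R2 is the proportion of exposure variance explained by the IVs, and p is the case fraction. The function name and every input value are hypothetical and are not taken from Du et al or from this commentary.

```python
import math
from statistics import NormalDist


def mr_power_binary(n, case_fraction, r2, log_or, alpha=0.05):
    """Approximate power of a two-sided test in MR with a binary outcome.

    Assumes var(estimate) ~= 1 / (n * r2 * p * (1 - p)); log_or is the
    causal log odds ratio per standard deviation of the exposure.
    Only the tail in the direction of the effect is counted, which is
    the usual approximation for a two-sided test.
    """
    normal = NormalDist()
    se = 1.0 / math.sqrt(n * r2 * case_fraction * (1.0 - case_fraction))
    z_crit = normal.inv_cdf(1.0 - alpha / 2.0)
    return normal.cdf(abs(log_or) / se - z_crit)


# Hypothetical scenario: dropping SNPs halves the variance explained by the IVs.
full = mr_power_binary(n=200_000, case_fraction=0.05, r2=0.010, log_or=math.log(1.2))
pruned = mr_power_binary(n=200_000, case_fraction=0.05, r2=0.005, log_or=math.log(1.2))
print("power with all instruments:", round(full, 3))
print("power after excluding SNPs:", round(pruned, 3))
```

Under these assumed inputs, halving R2 lowers the power noticeably, which is the mechanism by which indiscriminate SNP exclusion can weaken detection capability.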
peripheral vascular disease, hematology