High dimensional mediation analysis with latent variables

Andriy Derkach, Ruth M. Pfeiffer, Ting-Huei Chen, Joshua N. Sampson
DOI: https://doi.org/10.1111/biom.13053
IF: 1.701
2019-05-05
Biometrics
Abstract: We propose a model for high dimensional mediation analysis that includes latent variables. We describe our model in the context of an epidemiologic study for incident breast cancer with a main exposure and a large number of biomarkers (i.e., potential mediators). We assume that the exposure directly influences a group of latent, or unmeasured, factors which are associated with both the outcome and a subset of the biomarkers. The biomarkers associated with the latent factors linking the exposure to the outcome are considered "mediators". We derive the likelihood for this model and develop an expectation-maximization (EM) algorithm to maximize an L1-penalized version of this likelihood to limit the number of factors and associated biomarkers. We show that the resulting estimates are consistent and that the estimates of the non-zero parameters have an asymptotically normal distribution. In simulations, procedures based on this new model can have significantly higher power for detecting mediating biomarkers compared to simpler approaches. We apply our method to a study that evaluates the relationship between body mass index, 481 metabolic measurements and estrogen-receptor positive breast cancer. This article is protected by copyright. All rights reserved.
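The structure described in the abstract (exposure, then latent factors, then both the outcome and a subset of biomarkers) can be illustrated with a small simulation. This is only a sketch under assumptions that the abstract does not state: linear-Gaussian relations for the factors and biomarkers, a logistic model for the outcome, two latent factors, and arbitrary illustrative effect sizes. The function name `simulate_latent_mediation` and all parameter values are hypothetical. It does not implement the authors' penalized EM estimator.

```python
import numpy as np

def simulate_latent_mediation(n=1000, p=20, n_mediators=5, seed=0):
    """Simulate data with the exposure -> latent factor -> (biomarkers, outcome)
    structure. All functional forms and effect sizes are illustrative assumptions."""
    rng = np.random.default_rng(seed)

    # Exposure (e.g. a standardized body mass index).
    x = rng.normal(size=n)

    # Two latent factors: factor 0 is influenced by the exposure, factor 1 is not.
    alpha = np.array([0.8, 0.0])
    f = np.outer(x, alpha) + rng.normal(size=(n, 2))

    # Sparse loadings: the first n_mediators biomarkers load on factor 0,
    # the next n_mediators on factor 1, and the rest are pure noise.
    loadings = np.zeros((2, p))
    loadings[0, :n_mediators] = 1.0
    loadings[1, n_mediators:2 * n_mediators] = 1.0
    m = f @ loadings + rng.normal(size=(n, p))

    # Binary outcome depends only on factor 0 (assumed logistic link).
    beta = np.array([1.0, 0.0])
    prob = 1.0 / (1.0 + np.exp(-(-1.0 + f @ beta)))
    y = rng.binomial(1, prob)

    return x, m, y, loadings

x, m, y, loadings = simulate_latent_mediation()
# Biomarkers tied to the factor that links exposure and outcome: the "mediators".
mediator_idx = np.flatnonzero(loadings[0])
```

In this toy setup only the biomarkers in `mediator_idx` would count as mediators in the abstract's sense, because they are the ones associated with the factor that both responds to the exposure and affects the outcome.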
statistics & probability, mathematical & computational biology, biology