A Precision Environment-Wide Association Study of Hypertension via Supervised Cadre Models

Alexander New, Kristin P. Bennett
DOI: https://doi.org/10.1109/jbhi.2019.2918070
IF: 7.7
2020-03-01
IEEE Journal of Biomedical and Health Informatics
Abstract: We consider the problem in precision health of grouping people into subpopulations based on their degree of vulnerability to a risk factor. These subpopulations cannot be discovered with traditional clustering techniques because their quality is evaluated with a supervised metric: the ease of modeling a response variable for observations within them. Instead, we apply the more appropriate supervised cadre model (SCM). We extend the SCM formalism so that it may be applied to multivariate regression and binary classification problems and develop a way to use conditional entropy to assess the confidence in the process by which a subject is assigned their cadre. Using the SCM, we generalize the environment-wide association study (EWAS) to be able to model heterogeneity in population risk. In our EWAS, we consider more than 200 environmental exposure factors and find their association with diastolic blood pressure, systolic blood pressure, and hypertension. This requires adapting the SCM to be applicable to data generated by a complex survey design. After correcting for false positives, we found 25 exposure variables that had a significant association with at least one of our response variables. Eight of these were significant for a discovered subpopulation but not for the overall population. Some of these associations have been identified by previous researchers, whereas others appear to be novel. We examine discovered subpopulations in detail, finding that they are interpretable and suggestive of further research questions.
computer science, interdisciplinary applications; mathematical & computational biology; medical informatics, information systems