Characterizing Phenotypic Abnormalities Associated with High-Risk Individuals Developing Lung Cancer Using Electronic Health Records from the All of Us Researcher Workbench

Jie Na, Nansu Zong, Chen Wang, David E. Midthun, Yuan Luo, Ping Yang, Guoqian Jiang
DOI: https://doi.org/10.1093/jamia/ocab174
2021-01-01
Journal of the American Medical Informatics Association
Abstract: OBJECTIVE The study sought to test the feasibility of conducting a phenome-wide association study to characterize phenotypic abnormalities associated with individuals at high risk for lung cancer using electronic health records. MATERIALS AND METHODS We used the beta release of the All of Us Researcher Workbench with clinical and survey data from a population of 225 000 subjects. We identified 3 cohorts of individuals at high risk of developing lung cancer based on (1) the 2013 U.S. Preventive Services Task Force criteria, (2) the criteria for long-term quitters of cigarette smoking, and (3) the criteria for younger age of onset. We applied logistic regression analysis to identify significant associations between individuals' phenotypes and their risk categories. We validated our findings against a lung cancer cohort from the same population and conducted an expert review to determine whether these associations are known or potentially novel. RESULTS We found a total of 214 statistically significant associations (P < .05 with a Bonferroni correction and odds ratio > 1.5) enriched in the high-risk individuals from the 3 cohorts, and 15 enriched in the low-risk individuals. Forty significant associations enriched in the high-risk individuals and 13 enriched in the low-risk individuals were validated in the cancer cohort. Expert review identified 15 potentially new associations enriched in the high-risk individuals. CONCLUSIONS It is feasible to conduct a phenome-wide association study to characterize phenotypic abnormalities associated with individuals at high risk of developing lung cancer using electronic health records. The All of Us Researcher Workbench is a promising resource for research studies that evaluate and optimize lung cancer screening criteria.
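The significance rule quoted in the RESULTS (P < .05 with a Bonferroni correction and odds ratio > 1.5) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each phenotype has already been tested in its own logistic regression, yielding an odds ratio and an uncorrected P value. The names `Association` and `classify` are hypothetical, and the reciprocal cutoff (odds ratio < 1/1.5) used for low-risk enrichment is an assumption, since the abstract does not state how that direction was defined.

```python
from dataclasses import dataclass


@dataclass
class Association:
    phenotype: str     # hypothetical phenotype label
    odds_ratio: float  # from a per-phenotype logistic regression (assumed precomputed)
    p_value: float     # uncorrected P value for that phenotype


def classify(results, alpha=0.05, or_cutoff=1.5):
    """Return (high_risk_enriched, low_risk_enriched) phenotype labels."""
    n_tests = len(results)
    if n_tests == 0:
        return [], []
    # Bonferroni correction: divide the significance level by the number of tests.
    threshold = alpha / n_tests
    high, low = [], []
    for r in results:
        if r.p_value >= threshold:
            continue  # not significant after correction
        if r.odds_ratio > or_cutoff:
            high.append(r.phenotype)
        elif r.odds_ratio < 1 / or_cutoff:
            # Assumption: low-risk enrichment uses the reciprocal cutoff.
            low.append(r.phenotype)
    return high, low
```

With four tested phenotypes the corrected threshold is 0.05 / 4 = 0.0125, so a phenotype with an odds ratio of 3.0 but P = .03 would be excluded, while one with an odds ratio of 2.0 and P = .001 would count as enriched in the high-risk group.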