A metadata framework for computational phenotypes

Matthew Spotnitz, Nripendra Acharya, James J Cimino, Shawn Murphy, Bahram Namjou, Nancy Crimmins, Theresa Walunas, Cong Liu, David Crosslin, Barbara Benoit, Elisabeth Rosenthal, Jennifer A Pacheco, Anna Ostropolets, Harry Reyes Nieva, Jason S Patterson, Lauren R Richter, Tiffany J Callahan, Ahmed Elhussein, Chao Pang, Krzysztof Kiryluk, Jordan Nestor, Atlas Khan, Sumit Mohan, Evan Minty, Wendy Chung, Wei-Qi Wei, Karthik Natarajan, Chunhua Weng
DOI: https://doi.org/10.1093/jamiaopen/ooad032
2023-04-06
JAMIA Open
Abstract: With the burgeoning development of computational phenotypes, it is increasingly difficult to identify the right phenotype for the right task. This study uses a mixed-methods approach to develop and evaluate a novel metadata framework for retrieving and reusing computational phenotypes. Twenty active phenotyping researchers from 2 large research networks, Electronic Medical Records and Genomics and Observational Health Data Sciences and Informatics, were recruited to suggest metadata elements. Once consensus was reached on 39 metadata elements, 47 new researchers were surveyed to evaluate the utility of the metadata framework. The survey consisted of 5-point Likert-scale multiple-choice questions and open-ended questions. Two more researchers were asked to use the metadata framework to annotate 8 type-2 diabetes mellitus phenotypes. More than 90% of the survey respondents rated the metadata elements regarding phenotype definition and validation methods and metrics positively, with a score of 4 or 5. Both researchers completed the annotation of each phenotype within 60 min. Our thematic analysis of the narrative feedback indicates that the metadata framework was effective in capturing rich and explicit descriptions and in enabling phenotype search, compliance with data standards, and comprehensive validation metrics. Its current limitations are the complexity of data collection and the human costs entailed.