Abstract: The use of a standards-based, modular architecture for the development of phenotype algorithms enhances the interoperability of electronic health record (EHR) systems, allowing algorithms to be disseminated across institutions. Here we describe the implementation of previously proposed modules of a comprehensive solution for the development, validation, execution and dissemination of EHR-driven phenotype algorithms.

Introduction: The use of electronic health records (EHRs) for research has been a focus of the biomedical informatics research community, with many researchers and consortia describing methodologies for effective use of EHR data, as well as challenges discovered along the way[1, 2]. To aid in the development of phenotype algorithms using clinical data, several software solutions have been provided to the informatics community, including Informatics for Integrating Biology and the Bedside (i2b2)[3] and Observational Health Data Sciences and Informatics (OHDSI)[4]. We previously described the Phenotype Execution and Modeling Architecture (PhEMA), a modular software architecture that relies on components that interoperate using standard formats and interfaces[5]. The components are logically separated so that each completes a specific task, such as executing an algorithm and collecting the results. Having referenced existing systems while developing our proposed architecture, we noted limitations and gaps that we sought to address, specifically around increasing the use of standards and providing flexibility in configuring components to meet each institution's needs.

Methods: The PhEMA development team identified available software systems for many of the seven proposed architecture components for EHR phenotyping (Library for Artifacts, Authoring, Clinical Data Repository, Execution, Validation, Data Model Services and Terminology Services), and developed new software where no existing solution was available. During development, we designed the systems around concrete interfaces and specifications, but remained agnostic to the choice of a particular programming language or development environment.

Results: The PhEMA solution includes one or more implemented components, as shown in Figure 1. A demonstration system and source code are available from the project website (http://projectphema.org). Briefly, each of the implemented solutions is as follows:

Terminology Services: Our use of the Quality Data Model (QDM) relies on value sets (collections of terms that represent concepts, derived from standard vocabularies). We provide users with read access to the NLM-hosted Value Set Authority Center (VSAC) for existing value sets, and also provide a separate read/write instance of a repository for custom value sets. Both repositories leverage the Common Terminology Services 2 (CTS2) standard[6] (the VSAC CTS2 service utilizes a CTS2 wrapper [VSMC]). The authoring tool may be configured during installation to use one or both repositories.

Figure 1. Implemented components of the Phenotype Execution and Modeling Architecture (PhEMA). Blue boxes indicate newly developed software, while white boxes indicate existing solutions.
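The Terminology Services arrangement described above (a read-only source of existing value sets, a read/write repository for custom ones, and a tool configured to use one or both) can be illustrated with a minimal Python sketch. This is not the PhEMA implementation and makes no CTS2 or VSAC calls: the class names `ValueSet`, `InMemoryRepository` and `TerminologyService`, the identifiers `demo-1` and `demo-2`, and the lookup order are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class ValueSet:
    """A named collection of codes drawn from standard vocabularies."""
    identifier: str                      # hypothetical identifier, not a real OID
    name: str
    codes: Tuple[Tuple[str, str], ...]   # (code system, code) pairs


class InMemoryRepository:
    """Hypothetical stand-in for a CTS2-backed repository (no network access)."""

    def __init__(self, name: str, writable: bool,
                 value_sets: Iterable[ValueSet] = ()) -> None:
        self.name = name
        self.writable = writable
        self._store = {vs.identifier: vs for vs in value_sets}

    def get(self, identifier: str) -> Optional[ValueSet]:
        return self._store.get(identifier)

    def save(self, value_set: ValueSet) -> None:
        # A read-only source refuses writes, mirroring read-only access.
        if not self.writable:
            raise PermissionError(f"{self.name} is read-only")
        self._store[value_set.identifier] = value_set


class TerminologyService:
    """Looks up value sets across one or more configured repositories."""

    def __init__(self, repositories: List[InMemoryRepository]) -> None:
        if not repositories:
            raise ValueError("configure at least one repository")
        self._repositories = list(repositories)

    def find(self, identifier: str) -> Optional[ValueSet]:
        # Assumed policy: query repositories in configured order, first hit wins.
        for repo in self._repositories:
            value_set = repo.get(identifier)
            if value_set is not None:
                return value_set
        return None

    def save_custom(self, value_set: ValueSet) -> str:
        # Custom value sets go to the first writable repository.
        for repo in self._repositories:
            if repo.writable:
                repo.save(value_set)
                return repo.name
        raise PermissionError("no writable repository configured")


# Illustrative configuration using both repositories.
existing = ValueSet("demo-1", "Example diabetes codes", (("ICD-10-CM", "E11.9"),))
read_only = InMemoryRepository("existing-value-sets", writable=False,
                               value_sets=[existing])
custom = InMemoryRepository("custom-value-sets", writable=True)
service = TerminologyService([read_only, custom])
```

Configuring `TerminologyService` with a single repository instead of two corresponds to the "one or both" installation choice; with only the read-only repository configured, `save_custom` raises `PermissionError` in this sketch.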
