Abstract: The use of a standards-based, modular architecture for the development of phenotype algorithms enhances the interoperability of electronic health record (EHR) systems and allows algorithms to be disseminated across institutions. Here we describe the implementation of previously proposed modules of a comprehensive solution for the development, validation, execution and dissemination of EHR-driven phenotype algorithms.

Introduction: The use of electronic health records (EHRs) for research has been a focus of the biomedical informatics research community, with many researchers and consortia describing methodologies for effective use of EHR data, as well as challenges discovered along the way[1, 2]. To aid in the development of phenotype algorithms using clinical data, several software solutions have been provided to the informatics community, including Informatics for Integrating Biology and the Bedside (i2b2)[3] and Observational Health Data Sciences and Informatics (OHDSI)[4]. We previously described the Phenotype Execution and Modeling Architecture (PhEMA), a modular software architecture that relies on components that interoperate using standard formats and interfaces[5]. Each component is logically separated and completes a specific task, such as executing an algorithm and collecting the results. In reviewing existing systems while developing the proposed architecture, we noted limitations and gaps that we sought to address, specifically around increasing the use of standards and providing flexibility in configuring components to meet each institution’s needs.

Methods: The PhEMA development team has identified available software systems for many of the seven proposed architecture components (Library for Artifacts, Authoring, Clinical Data Repository, Execution, Validation, Data Model Services and Terminology Services) for EHR phenotyping, and developed new software where no existing solution was available. During development, we designed the systems around concrete interfaces and specifications, but remained agnostic to the choice of a particular programming language or development environment.

Results: The PhEMA solution includes one or more implemented components, as shown in Figure 1. A demonstration system and source code are available from the project website (http://projectphema.org). Briefly, each of the implemented solutions is as follows:

Terminology Services: Our use of the Quality Data Model (QDM) relies on value sets (collections of terms that represent concepts, derived from standard vocabularies). We provide users with read access to the NLM-hosted Value Set Authority Center (VSAC) for existing value sets, and also provide a separate read/write instance of a repository for custom value sets. Both repositories leverage the Common Terminology Services 2 (CTS2) standard[6] (the VSAC CTS2 service utilizes a CTS2 wrapper [VSMC]). The authoring tool may be configured during installation to use one or both repositories.

Figure 1. Implemented components of the Phenotype Execution and Modeling Architecture (PhEMA). Blue boxes indicate newly developed software, while white boxes indicate existing solutions.
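The repository arrangement described under Terminology Services (a read-only source of existing value sets plus a read/write store for custom ones, with the authoring tool configured to use one or both) can be sketched as follows. This is a minimal illustration under stated assumptions, not the PhEMA implementation: the class names, the in-memory storage, and the value set identifiers and codes are all hypothetical, and no actual CTS2 or VSAC calls are made.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ValueSetRepository:
    """Hypothetical stand-in for a CTS2-backed value set repository."""
    name: str
    writable: bool
    value_sets: Dict[str, List[str]] = field(default_factory=dict)

    def find(self, value_set_id: str) -> Optional[List[str]]:
        # Return the codes for a value set, or None if it is not stored here.
        return self.value_sets.get(value_set_id)

    def save(self, value_set_id: str, codes: List[str]) -> None:
        # Only the read/write repository accepts custom value sets.
        if not self.writable:
            raise PermissionError(f"{self.name} is read-only")
        self.value_sets[value_set_id] = list(codes)


@dataclass
class TerminologyConfig:
    """Repositories the authoring tool is configured to use (one or both)."""
    repositories: List[ValueSetRepository]

    def resolve(self, value_set_id: str) -> Tuple[str, List[str]]:
        # Search the configured repositories in order; first match wins.
        for repo in self.repositories:
            codes = repo.find(value_set_id)
            if codes is not None:
                return repo.name, codes
        raise KeyError(value_set_id)

    def save_custom(self, value_set_id: str, codes: List[str]) -> str:
        # Store a custom value set in the first writable repository.
        for repo in self.repositories:
            if repo.writable:
                repo.save(value_set_id, codes)
                return repo.name
        raise PermissionError("no writable repository configured")


# Placeholder identifiers and codes, for illustration only.
vsac = ValueSetRepository(
    "vsac", writable=False, value_sets={"example-vs-1": ["code-a", "code-b"]}
)
custom = ValueSetRepository("custom", writable=True)

config = TerminologyConfig([vsac, custom])
stored_in = config.save_custom("example-vs-2", ["code-c"])
```

Configuring the tool with only the read-only repository would still allow lookups, but `save_custom` would raise an error, which mirrors the read versus read/write distinction in the text.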

The Phenotype Execution and Modeling Architecture (PhEMA) – A Standards-Based Composition of Software for Phenotype Algorithm Development
