Creating a Medication Therapy Observational Research Database from an Electronic Medical Record: Challenges and Data Curation

Eberl, Sonja
DOI: https://doi.org/10.1055/s-0043-1777741
IF: 2.762
2024-02-09
Applied Clinical Informatics
Abstract:
Background: Observational research has shown its potential to complement experimental research and clinical trials through secondary use of treatment data from hospital care processes. It can also be applied to better understand pediatric drug utilization in order to establish safer drug therapy. Clinical documentation processes often limit data quality in pediatric medical records, requiring data curation steps that are mostly underestimated.
Objectives: The objectives of this study were to transform and curate data from a departmental electronic medical record into an observational research database. We particularly aimed at identifying data quality problems, illustrating the reasons for such problems, and describing the systematic data curation process established to create high-quality data for observational research.
Methods: Data were extracted from an electronic medical record used by four wards of a German university children's hospital from April 2012 to June 2020. A four-step data preparation, mapping, and curation process was established. Data quality of the generated dataset was assessed first by following an established 3 × 3 Data Quality Assessment guideline and second by comparing a sample subset of the database with an existing gold standard.
Results: The generated dataset consists of 770,158 medication dispensations associated with 89,955 different drug exposures from 21,285 clinical encounters. A total of 6,840 different narrative drug therapy descriptions were mapped to 1,139 standard terms for drug exposures. Regarding the quality criterion correctness, the database was consistent and had overall a high agreement with our gold standard.
Conclusion: Despite large amounts of free-text descriptions and contextual knowledge implicitly included in the electronic medical record, we were able to identify relevant data quality issues and to establish a semi-automated data curation process leading to a high-quality observational research database.
Because of inconsistent dosage information in the original documentation, this database is limited to a drug utilization database without detailed dosage information.
A positive ethics vote exists for the study (Application No 561_20 BC, ethics commission of the Friedrich-Alexander-Universität chaired by Prof. Dr. med. Renke Maas).
[Supplementary Material 1] (available in online version only) is a detailed description of the thorough intra-database quality assessment. [Supplementary Material 2] (available in online version only) is a detailed description of the thorough extra-database quality assessment.
The data that support the findings of this study are available from the senior author S.E. (Sonja.eberl@uk-erlangen.de), upon reasonable request.
Received: 26 April 2023
Accepted: 28 August 2023
Article published online: 07 February 2024
© 2024. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Georg Thieme Verlag KG, Rüdigerstraße 14, 70469 Stuttgart, Germany
medical informatics