Bridging health registry data acquisition and real-time data analytics

Johannes Schmidt, Sita Arjune, Volker Boehm, Roman-Ulrich Mueller, Philipp Antczak
DOI: https://doi.org/10.1101/2024.06.12.24308496
2024-06-14
Abstract: The number of clinical studies and associated research has increased significantly in the last few years. Particularly in rare diseases, an increased effort has been made to integrate, analyse, and develop new knowledge to improve patient stratification and wellbeing. Clinical databases, including digital medical records, hold a significant amount of information that can help understand the impact and progression of diseases. Combining and integrating this data, however, has proved a challenge for data scientists due to the complex structures of digital medical records and the lack of site-wide standardisation of data entry. To address these challenges we present a Python-backed tool, Meda, which aims to collect data from different sources and combine them in a unified database structure for near real-time monitoring of clinical data. Together with an R Shiny interface, we can provide a near-complete platform for real-time analysis and visualization.
What problem does this paper attempt to address?
The main problem this paper attempts to address is the gap between clinical data integration and real-time analysis, particularly in rare disease research. As the number of clinical studies grows, especially in rare diseases, so does the need to integrate and analyze data and to derive new knowledge from it in order to improve patient stratification and well-being. However, the complex structure of digital medical records and the lack of standardized data entry pose challenges for data scientists. To tackle these challenges, the authors introduce a Python-based tool called Meda, which collects data from different sources and integrates it into a unified database structure, enabling near real-time monitoring of clinical data. Combined with an R Shiny interface, the tool provides a near-complete platform for real-time analysis and visualization.

Specifically, the paper addresses the following key issues:

1. **Data Integration and Standardization**: The data structure in clinical databases is complex and lacks standardization, making data integration difficult. Meda provides a flexible and dynamic data transformation service that standardizes data from different sources and integrates it into a centralized database.
2. **Data Quality Control**: In existing medical data systems, data quality control is usually performed only when data is extracted for clinical research, so data entry errors are hard to detect in a timely manner. Meda improves data quality through an automated data validation and threshold checking system.
3. **Real-Time Data Visualization**: Traditional clinical database designs do not support real-time data visualization, which limits the immediate use and analysis of data. Meda, combined with the R Shiny interface, enables real-time data visualization, allowing healthcare professionals to access and analyze patient information instantly.
4. **Data Management and Auditing**: Meda provides automated workflows, including data import, validation, error logging, and log management, ensuring data consistency and traceability.

By addressing these issues, Meda aims to provide an efficient and reliable data management and analysis platform for clinical research and healthcare institutions.
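
To make the transformation, validation, and error-logging steps described above more concrete, the following is a minimal Python sketch of how such a pipeline might look. It is not Meda's actual code or API: the field names, the source labels (`site_a`, `site_b`), the plausibility limits, and the helper functions `harmonise`, `validate`, and `import_batch` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from source-specific column names to a unified schema.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "site_a": {"pid": "patient_id", "sys_bp": "systolic_bp_mmhg", "gfr": "egfr_ml_min"},
    "site_b": {"PatientID": "patient_id", "RR_sys": "systolic_bp_mmhg", "eGFR": "egfr_ml_min"},
}

# Hypothetical plausibility limits (low, high) per unified field; not taken from the paper.
THRESHOLDS: Dict[str, Tuple[float, float]] = {
    "systolic_bp_mmhg": (50.0, 250.0),
    "egfr_ml_min": (0.0, 200.0),
}


@dataclass
class ValidationIssue:
    """One entry of the error log, kept so that rejected values stay traceable."""
    patient_id: Any
    field: str
    value: Any
    reason: str


def harmonise(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename source-specific keys to the unified schema; unmapped keys are dropped."""
    mapping = FIELD_MAP[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def validate(record: Dict[str, Any]) -> List[ValidationIssue]:
    """Check every thresholded field for presence, numeric type, and range."""
    issues: List[ValidationIssue] = []
    pid = record.get("patient_id")
    for field, (low, high) in THRESHOLDS.items():
        value = record.get(field)
        if value is None:
            issues.append(ValidationIssue(pid, field, value, "missing"))
            continue
        try:
            number = float(value)
        except (TypeError, ValueError):
            issues.append(ValidationIssue(pid, field, value, "not numeric"))
            continue
        if not low <= number <= high:
            issues.append(ValidationIssue(pid, field, value, "out of range"))
    return issues


def import_batch(
    source: str, records: List[Dict[str, Any]]
) -> Tuple[List[Dict[str, Any]], List[ValidationIssue]]:
    """Harmonise and validate a batch; return accepted records and the error log."""
    accepted: List[Dict[str, Any]] = []
    error_log: List[ValidationIssue] = []
    for raw in records:
        unified = harmonise(source, raw)
        issues = validate(unified)
        if issues:
            error_log.extend(issues)  # reject the record, keep the reasons
        else:
            accepted.append(unified)
    return accepted, error_log


if __name__ == "__main__":
    batch = [
        {"pid": 1, "sys_bp": 120, "gfr": 85},
        {"pid": 2, "sys_bp": 400, "gfr": "n/a"},
    ]
    accepted, error_log = import_batch("site_a", batch)
    print(accepted)
    for issue in error_log:
        print(issue)
```

In this sketch a record with any issue is held back in full and every issue is logged, so an entry error surfaces at import time rather than at extraction for a study. A real deployment would write the accepted records to the central database and persist the log; both are omitted here.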