Data Quality in Medical Real-World Data - An Oncological Use Case

Julia Gehrmann, Oya Beyan
DOI: https://doi.org/10.3233/SHTI240332
2024-08-22
Abstract: Data quality deficiencies significantly limit the applicability of real-world data in data-driven medical research. In this study, using an oncological use case, we report and discuss common quality deficiencies in real-world medical datasets, such as missing data, class imbalances, and timeliness issues. We compiled a multi-departmental real-world dataset comprising 13,861 cancer cases diagnosed at University Hospital Cologne and examined data quality throughout the data integration process.