Assessing the Quality of Electronic Data for 'Fit-for-Purpose' by Utilizing Data Profiling Techniques Prior to Conducting a Survival Analysis for Adults with Acute Lymphoblastic Leukemia

Victoria Ngo, Theresa H Keegan, Brian A Jonas, Michael Hogarth, Katherine K Kim
2021-01-25
Abstract: Acute lymphoblastic leukemia affects both children and adults. Rising costs of cancer care and patient burden contribute to the need to study factors influencing outcomes. This study explored the quality of datasets generated from a clinical research institution. The 'fit-for-use' of the data prior to examining survival and complications was determined through a systematic approach guided by the Weiskopf et al. 3x3 Data Quality Assessment Framework. The constructs of completeness, correctness, and currency were explored across the data dimensions of patient, variables, and time. Eleven types of data were retrieved. Each dataset contained sufficient data points for patient and variable data (≥70% of its cells filled with patient-level data). Although there was concordance between variables, we found the distributions of lab values and death data to be incorrect. Values were also missing for labs ordered and death dates. Our study showed that retrieved datasets can vary, even when they come from the same institution.
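The completeness criterion mentioned in the abstract (≥70% of cells filled) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `completeness` and `is_sufficient`, the toy dataset, and the choice to treat `None` and empty strings as missing are all assumptions made for illustration.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[object]]


def completeness(rows: Sequence[Row]) -> float:
    """Return the fraction of cells that are filled.

    Assumption: a cell counts as missing if it is None or an empty string.
    """
    total = sum(len(row) for row in rows)
    if total == 0:
        return 0.0
    filled = sum(
        1 for row in rows for cell in row if cell is not None and cell != ""
    )
    return filled / total


def is_sufficient(rows: Sequence[Row], threshold: float = 0.70) -> bool:
    """Check a dataset against the >=70% filled-cell threshold from the abstract."""
    return completeness(rows) >= threshold


# Hypothetical toy dataset; columns: patient_id, lab_value, death_date
labs = [
    ["P1", 4.2, None],
    ["P2", None, "2019-03-01"],
    ["P3", 5.1, None],
    ["P4", 3.9, "2020-07-15"],
]

print(f"completeness = {completeness(labs):.2f}")
print(f"sufficient   = {is_sufficient(labs)}")
```

A check like this covers only the completeness construct. Correctness and currency, the other two constructs in the framework, would need separate checks such as plausible value ranges or date ordering.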