Scalable quality control on processing of large diffusion-weighted and structural magnetic resonance imaging datasets
Michael E. Kim, Chenyu Gao, Karthik Ramadass, Praitayini Kanakaraj, Nancy R. Newlin, Gaurav Rudravaram, Kurt G. Schilling, Blake E. Dewey, David A. Bennett, Sid O'Bryant, Robert C. Barber, Derek Archer, Timothy J. Hohman, Shunxing Bao, Zhiyuan Li, Bennett A. Landman, Nazirah Mohd Khairi, Alzheimer's Disease Neuroimaging Initiative, HABS-HD Study Team
2024-09-26
Abstract: Proper quality control (QC) is time-consuming when working with large-scale medical imaging datasets, yet necessary, as poor-quality data can lead to erroneous conclusions or poorly trained machine learning models. Most efforts to reduce data QC time rely on outlier detection, which cannot capture every instance of algorithm failure. Thus, there is a need to visually inspect every output of data processing pipelines in a scalable manner. We design a QC pipeline that allows for low time cost and effort across a team setting for a large database of diffusion-weighted and structural magnetic resonance images. Our proposed method satisfies the following design criteria: (1) a consistent way to perform and manage quality control across a team of researchers, (2) quick visualization of preprocessed data that minimizes the effort and time spent on the QC process without compromising the quality of the QC, and (3) a way to aggregate QC results across pipelines and datasets that can be easily shared. In addition to meeting these design criteria, we also provide information on what a successful output should look like and on common algorithm failures for various processing pipelines. Our method reduces the time spent on QC by a factor of over 20 when compared to naively opening outputs in an image viewer, and we demonstrate how it can facilitate aggregation and sharing of QC results within a team. While researchers must spend time on robust visual QC of data, there are mechanisms by which the process can be streamlined and made efficient.
Distributed, Parallel, and Cluster Computing