Evaluating the Clinical Utility of Artificial Intelligence Assistance and its Explanation on Glioma Grading Task

Weina Jin,Mostafa Fatehi,Ru Guo,Ghassan Hamarneh
DOI: https://doi.org/10.1101/2022.12.07.22282726
2022-12-10
MedRxiv
Abstract:As a fast-advancing technology, artificial intelligence (AI) has considerable potential to assist physicians in various clinical tasks from disease identification to lesion segmentation. Despite much research, AI has not yet been applied to neuro-oncological imaging in a clinically meaningful way. To bridge the clinical implementation gap of AI in neuro-oncological settings, we conducted a clinical user-based evaluation, analogous to the phase II clinical trial, to evaluate the utility of AI for diagnostic predictions and the value of AI explanations on the glioma grading task. Using the publicly-available BraTS dataset, we trained an AI model of 88.0% accuracy on the glioma grading task. We selected the SmoothGrad explainable AI algorithm based on the computational evaluation regarding explanation truthfulness among a candidate of 16 commonly-used algorithms. SmoothGrad could explain the AI model's prediction using a heatmap overlaid on the MRI to highlight important regions for AI prediction. The evaluation is an online survey wherein the AI prediction and explanation are embedded. Each of the 35 neurosurgeon participants read 25 brain MRI scans of patients with gliomas, and gave their judgment on the glioma grading without and with the assistance of AI's prediction and explanation. Compared to the average accuracy of 82.5{+/-}8.7% when physicians perform the task alone, physicians' task performance increased to 87.7{+/-}7.3% with statistical significance (p-value = 0.002) when assisted by AI prediction, and remained at almost the same level of 88.5{+/-}7.0% (p-value = 0.35) with the additional AI explanation assistance. The evaluation shows the clinical utility of AI to assist physicians on the glioma grading task. It also reveals the limitations of applying existing AI explanation techniques in clinical settings.
What problem does this paper attempt to address?