Three-Dimensional and Explainable Deep Learning for Evaluation of Chronic Otitis Media Based on Temporal Bone Computed Tomography: Development Study (Preprint)
Binjun Chen, Yike Li, Yu Sun, Haojie Sun, Yanmei Wang, Jihan Lyu, Jiajie Guo, Yushu Cheng, Xun Niu, Lian Yang, Jianghong Xu, Juanmei Yang, Yibo Huang, Fanglu Chi, Bo Liang, Dongdong Ren
DOI: https://doi.org/10.2196/preprints.51706
2023-01-01
Abstract:
BACKGROUND: Computed tomography (CT) of the temporal bone has become a critical diagnostic tool for chronic otitis media (COM), but its interpretation requires training and experience. Artificial intelligence may help clinicians evaluate COM on CT efficiently and reliably, but the logic behind its decisions can be incomprehensible, and no current model makes full use of the multidimensional diagnostic information.
OBJECTIVE: This study aimed to develop an explainable, three-dimensional (3D) deep learning framework for the detection and differential diagnosis of COM based on CT.
METHODS: Temporal bone CT scans were retrospectively obtained from patients who underwent surgery for COM between December 2015 and July 2021 at two independent institutes. The region of interest containing the middle ear was automatically segmented, and 3D convolutional neural networks were then trained to identify pathological ears and cholesteatoma. Gradient-weighted class activation mapping was used to generate heatmaps highlighting the regions critical for decision-making. Model performance was evaluated over five rounds of cross-validation and external validation and was benchmarked against clinical experts.
RESULTS: The internal and external datasets contained 1,661 patients (number of eligible ears [n]=3,153) and 108 patients (n=211), respectively. In detecting pathological ears, the deep learning model achieved comparable area under the receiver operating characteristic curve (AUROC) scores (mean ± SD: 0.96±0.01 and 0.93±0.01) and accuracies (87.8±1.7% and 84.3±1.5%) on the two datasets. Similar outcomes were observed in identifying cholesteatoma, with AUROCs of 0.85±0.03 and 0.83±0.05 and accuracies of 78.3±4.0% and 81.3±3.3%, respectively. In both tasks, the model performed as well as or better than the experts’ average and was much more consistent. The heatmaps appropriately highlighted the middle ear and mastoid regions, consistent with human knowledge of temporal bone CT interpretation.
CONCLUSIONS: This study suggests that a 3D deep learning framework is feasible for the automatic evaluation of COM using CT. The model demonstrates good performance, generalizability, and transparency, making it a useful tool to assist clinicians in the assessment of COM.
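The abstract reports AUROC as mean ± SD over five rounds of cross-validation. As a minimal sketch of how such a summary could be computed, and not the authors' code, the example below derives AUROC from the pairwise ranking of positive and negative cases and then aggregates per-fold values. The function names `auroc` and `summarize_folds`, the toy labels and scores, and the use of the sample standard deviation are all assumptions made for illustration; the abstract does not state which SD estimator was used.

```python
from statistics import mean, stdev


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_folds(folds):
    """Return (mean, sample SD) of AUROC across folds.

    Each fold is a (labels, scores) pair for its held-out cases.
    Sample SD is an assumption; the source does not specify the estimator.
    """
    values = [auroc(labels, scores) for labels, scores in folds]
    return mean(values), stdev(values)


# Toy folds invented for illustration only; these are not study data.
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]),
    ([1, 0, 1, 0], [0.7, 0.7, 0.6, 0.2]),
]

if __name__ == "__main__":
    m, sd = summarize_folds(toy_folds)
    print(f"AUROC: {m:.2f} ± {sd:.2f}")
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a sketch; a rank-based implementation would be the usual choice for datasets with thousands of ears.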