Application of high resolution computed tomography image assisted classification model of middle ear diseases based on 3D-convolutional neural network
Ri Su,Jian Song,Zheng Wang,Shuang Mao,Yitao Mao,Xuewen Wu,Muzhou Hou
DOI: https://doi.org/10.11817/j.issn.1672-7347.2022.210704
2022-08-28
Abstract: Objectives: Chronic suppurative otitis media (CSOM) and middle ear cholesteatoma (MEC) are the 2 most common chronic middle ear diseases. Because their clinical manifestations are similar, the 2 diseases are prone to misdiagnosis and missed diagnosis during diagnosis and treatment. High resolution computed tomography (HRCT) can clearly display the fine anatomical structure of the temporal bone and accurately reflect middle ear lesions and their extent, which gives it advantages in the differential diagnosis of chronic middle ear diseases. This study aims to develop a deep learning model for automatic information extraction and classification diagnosis of chronic middle ear diseases based on temporal bone HRCT image data, in order to improve the efficiency of classification and diagnosis in clinical practice and to reduce missed diagnosis and misdiagnosis. Methods: The clinical records and temporal bone HRCT imaging data of patients with chronic middle ear diseases hospitalized in the Department of Otorhinolaryngology, Xiangya Hospital from January 2018 to October 2020 were retrospectively collected. The patients' medical records were independently reviewed by 2 experienced otorhinolaryngologists, and the final diagnosis was reached by consensus. A total of 499 patients (998 ears) were enrolled in this study. The 998 ears were divided into 3 groups: an MEC group (108 ears), a CSOM group (622 ears), and a normal group (268 ears). Gaussian noise with different variances was used to augment the samples of the dataset to offset the imbalance in the number of samples between groups. The sample size of the augmented experimental dataset was 1 806 ears. Of these, 75% (1 355) of the samples were randomly selected for training, 10% (180) for validation, and the remaining 15% (271) for testing and evaluating the model performance.
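The augmentation and split described above can be illustrated with a minimal sketch. It uses only the Python standard library and tiny random volumes in place of real HRCT data. Several details are assumptions and not taken from the study: the per-group copy factors (`COPIES`, chosen only because 108 × 6 + 622 × 1 + 268 × 2 = 1 806), the noise standard deviations (`SIGMAS`), and all function names. The subset sizes are passed as explicit counts because the abstract does not say how the percentages were rounded.

```python
import random

# Ear counts per group as reported in the abstract.
GROUP_SIZES = {"MEC": 108, "CSOM": 622, "normal": 268}

# ASSUMPTION: total copies kept per original ear (1 = original only).
# Not stated in the source; chosen only because
# 108 * 6 + 622 * 1 + 268 * 2 == 1806.
COPIES = {"MEC": 6, "CSOM": 1, "normal": 2}

# ASSUMPTION: illustrative noise standard deviations (one per noisy copy).
SIGMAS = [0.01, 0.02, 0.03, 0.04, 0.05]


def add_gaussian_noise(volume, sigma, rng):
    """Return a noisy copy of a volume stored as nested lists [slice][row][col]."""
    return [[[v + rng.gauss(0.0, sigma) for v in row] for row in sl] for sl in volume]


def augment(samples, copies, sigmas, rng):
    """Keep every original and add copies[label] - 1 noisy versions of it,
    each generated with a different noise level."""
    out = []
    for volume, label in samples:
        out.append((volume, label))
        for k in range(copies[label] - 1):
            sigma = sigmas[k % len(sigmas)]
            out.append((add_gaussian_noise(volume, sigma, rng), label))
    return out


def split_by_counts(items, n_train, n_val, rng):
    """Shuffle, then cut into train / validation / test (test is the remainder)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


def build_toy_dataset(rng):
    """Create one tiny 2x2x2 stand-in volume per ear so the sketch runs without images."""
    data = []
    for label, n in GROUP_SIZES.items():
        for _ in range(n):
            vol = [[[rng.random() for _ in range(2)] for _ in range(2)] for _ in range(2)]
            data.append((vol, label))
    return data


rng = random.Random(0)
dataset = augment(build_toy_dataset(rng), COPIES, SIGMAS, rng)
train, val, test = split_by_counts(dataset, 1355, 180, rng)
print(len(dataset), len(train), len(val), len(test))
```

The sketch follows the order given in the text (augment, then split at random). Whether noisy copies of the same ear were kept within one subset is not specified in the abstract, and the sketch makes no attempt to enforce it.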
The overall design of the model was a serial structure consisting of 3 deep learning models with different functions. The first model was a region proposal network (regional recommendation network) algorithm, which located the middle ear region in the whole HRCT image and then cropped and saved it. The second model was an image-contrast convolutional neural network (CNN) based on a twin (Siamese) network structure, which searched the cropped images for those matching the key layers of the HRCT images and constructed 3D data blocks from them. The third model was based on 3D-CNN operations; it performed the final classification and diagnosis on the constructed 3D data blocks and output the final prediction probability. Results: The key-layer search network based on the twin network structure achieved an average AUC of 0.939 on the 10 key layers. The overall accuracy of the classification network based on 3D-CNN was 96.5%, the overall recall rate was 96.4%, and the average AUC over the 3 classes was 0.983. The recall rates for CSOM cases and MEC cases were 93.7% and 97.4%, respectively. In the subsequent comparison experiments, the average accuracy of several classical CNNs was 79.3%, and the average recall rate was 87.6%. The accuracy and the recall rate of the deep learning network constructed in this study were thus about 17.2 and 8.8 percentage points higher, respectively, than those of the common CNNs. Conclusions: The deep learning network model proposed in this study can automatically extract 3D data blocks containing middle ear features from the temporal bone HRCT image data of patients, which reduces the overall size of the data while preserving the relationship between corresponding images, and can then use 3D-CNN for the classification and diagnosis of CSOM and MEC. The design of this model is well suited to the continuous nature of HRCT data, and the experimental results show high precision and adaptability, outperforming the current common CNN methods.
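As an illustration of how the evaluation figures in the Results are typically obtained, the following sketch computes accuracy, per-class recall, and per-class precision from a 3-class confusion matrix. The matrix `TOY_CM` is hypothetical and is not the study's data; the function names are likewise illustrative. Whether the reported overall recall is a macro average or a weighted average is not stated in the abstract, so a macro average is assumed here. The last helper reproduces the reported gaps: 96.5 − 79.3 = 17.2 and 96.4 − 87.6 = 8.8 percentage points.

```python
LABELS = ["MEC", "CSOM", "normal"]

# HYPOTHETICAL confusion matrix for illustration only (not the study's results).
# Rows are true classes, columns are predicted classes, both in LABELS order.
TOY_CM = [
    [9, 1, 0],
    [1, 18, 1],
    [0, 0, 10],
]


def accuracy(cm):
    """Fraction of all samples that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def recall_per_class(cm):
    """For each true class, the fraction of its samples predicted correctly."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]


def precision_per_class(cm):
    """For each predicted class, the fraction of those predictions that are correct."""
    return [cm[j][j] / sum(row[j] for row in cm) for j in range(len(cm))]


def macro_average(values):
    """Unweighted mean over classes (ASSUMPTION: macro averaging)."""
    return sum(values) / len(values)


def percentage_point_gap(ours, baseline):
    """Difference between two percentages, rounded to one decimal place."""
    return round(ours - baseline, 1)


print("accuracy:", accuracy(TOY_CM))
print("recall per class:", dict(zip(LABELS, recall_per_class(TOY_CM))))
print("macro recall:", macro_average(recall_per_class(TOY_CM)))
print("precision per class:", dict(zip(LABELS, precision_per_class(TOY_CM))))
print("gaps:", percentage_point_gap(96.5, 79.3), percentage_point_gap(96.4, 87.6))
```

Accuracy and precision are distinct quantities: accuracy is computed over all samples, whereas precision is computed per predicted class, so the two should not be used interchangeably when comparing models.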