Ensemble of Multiscale Fine-Tuning Convolutional Neural Networks for Recognition of Benign and Malignant Thyroid Nodules

Liang Jiawei, Qiu Taorong, Zhou Aiyun, Xu Pan, Xie Xuemei, Fu Hao
DOI: https://doi.org/10.3724/sp.j.1089.2021.18254
2021-01-01
Journal of Computer-Aided Design & Computer Graphics
Abstract: Classification of thyroid nodules in ultrasound images often performs poorly because training samples are few and multiscale structure and texture information is ignored. To improve the accuracy of diagnosing benign and malignant thyroid nodules, this paper proposes an ultrasound image recognition method based on an ensemble of multiscale fine-tuned convolutional neural networks. First, each image is converted to three different scales as input data, so that the model can learn features at different scales and its feature extraction ability improves. Second, nine fine-tuned models covering the three scales are constructed by optimizing the fully connected layer structure of three pretrained models (AlexNet, VGG16 and ResNet50) and applying a transfer learning and fine-tuning strategy, so that the models better learn the feature differences between the source domain (ImageNet) and the target domain (thyroid ultrasound images). Finally, the optimal combination of fine-tuned models is selected, and the final ensemble model is obtained by weighted fusion of the models' output class probabilities, which exploits model diversity to further improve classification performance. The proposed algorithm was compared with other algorithms on a real data set; its accuracy, sensitivity, specificity and area under the curve (AUC) for benign and malignant thyroid nodules were 96.0%, 94.1%, 97.7% and 0.98, respectively. The experimental results show that the algorithm outperforms traditional machine learning algorithms and other algorithms for benign and malignant thyroid nodule identification, and that it can effectively extract complementary visual feature information with satisfactory classification performance.
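The weighted fusion of output class probabilities described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weights, the selected model combination, or any model outputs, so everything below is an assumption made for illustration: the function names `weighted_fusion` and `predict`, the class order [benign, malignant], the probability values, and the weights are all hypothetical.

```python
def weighted_fusion(prob_sets, weights):
    """Fuse per-model class probabilities with a normalized weighted average.

    prob_sets: one entry per model; each entry is a list of rows
               [p_benign, p_malignant], one row per image.
    weights:   one non-negative weight per model (hypothetical values).
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so fused rows still sum to 1

    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    fused = []
    for i in range(n_samples):
        row = [
            sum(norm[m] * prob_sets[m][i][c] for m in range(len(prob_sets)))
            for c in range(n_classes)
        ]
        fused.append(row)
    return fused


def predict(fused):
    """Return the index of the most probable class for each fused row."""
    return [max(range(len(row)), key=row.__getitem__) for row in fused]


# Made-up softmax outputs for two images from three fine-tuned models.
alexnet = [[0.60, 0.40], [0.30, 0.70]]
vgg16 = [[0.40, 0.60], [0.20, 0.80]]
resnet50 = [[0.70, 0.30], [0.55, 0.45]]

# Hypothetical weights; in practice they might be tuned on a validation set.
weights = [0.2, 0.3, 0.5]

fused = weighted_fusion([alexnet, vgg16, resnet50], weights)
labels = predict(fused)  # 0 = benign, 1 = malignant (assumed class order)
```

In this toy input the individual models disagree on both images, and the fused probabilities settle each case according to the weights. This is the sense in which an ensemble can benefit from model diversity.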