Ultrasound image analysis using deep neural networks for discriminating between benign and malignant ovarian tumors: comparison with expert subjective assessment

F. Christiansen, E. L. Epstein, E. Smedberg, M. Åkerlund, K. Smith, E. Epstein
DOI: https://doi.org/10.1002/uog.23530
2021-01-01
Ultrasound in Obstetrics and Gynecology
Abstract:
Objectives: To develop and test computerized ultrasound image analysis using deep neural networks (DNNs) to discriminate benign and malignant ovarian tumours, and to compare the diagnostic accuracy with subjective assessment (SA) by ultrasound experts.
Methods: We included 3077 (grayscale n=1927, power Doppler n=1150) ultrasound images from 758 women with ovarian tumours, prospectively classified by expert ultrasound examiners according to IOTA (International Ovarian Tumor Analysis). Histological outcome from surgery (n=634) or long-term (> 3 years) follow-up (n=124) served as gold standard. The dataset was split into a training set (n=508; 314 benign, 194 malignant), a validation set (n=100; 60 benign, 40 malignant) and a test set (n=150; 75 benign, 75 malignant). We used transfer learning on three pre-trained DNNs: VGG16, ResNet50 and MobileNet. Each model was trained, and its outputs were calibrated using temperature scaling. An ensemble of the three models was then used to estimate the probability of malignancy based on all images from a given case. Using DNNs, tumours were classified as benign or malignant (Ovry-Dx1); or benign, inconclusive or malignant (Ovry-Dx2). The DNNs were compared to SA based on sensitivity and specificity on the test set.
Results: At the same sensitivity (96.0%), the specificities of Ovry-Dx1 (86.7%) and SA (88.0%) were not significantly different (p=1.0). Ovry-Dx2 had a sensitivity of 97.1% and a specificity of 93.7% when designating 12.7% of the lesions as inconclusive. When Ovry-Dx2 was complemented with SA in inconclusive cases, the overall sensitivity (96.0%) and specificity (89.3%) were not significantly different from using SA in all cases (p=1.0).
Conclusions: Ultrasound image analysis using DNNs can predict ovarian malignancy with a diagnostic accuracy comparable to human expert examiners, indicating that these models may have a role in the triage of women with ovarian tumours.
radiology, nuclear medicine & medical imaging,obstetrics & gynecology,acoustics
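The Methods in the abstract describe a pipeline of temperature-scaled model outputs, an ensemble over all images of a case, and a two-way (Ovry-Dx1) or three-way (Ovry-Dx2) decision. The following is a minimal sketch of how such a pipeline could look, not the authors' implementation. Several details are assumptions that the abstract does not state: that each model emits a (benign, malignant) logit pair per image, that the ensemble is a plain average over images and then over models, the threshold values `low` and `high`, and that inconclusive cases are excluded when sensitivity and specificity are computed. All function names and the toy numbers are hypothetical.

```python
import math


def malignant_probability(logits, temperature=1.0):
    """Temperature-scaled softmax over a (benign, malignant) logit pair.

    Dividing the logits by a temperature > 1 softens the probabilities;
    this is the standard form of temperature scaling.
    """
    benign, malignant = (z / temperature for z in logits)
    m = max(benign, malignant)  # subtract the max for numerical stability
    e_b, e_m = math.exp(benign - m), math.exp(malignant - m)
    return e_m / (e_b + e_m)


def case_probability(logits_by_model, temperatures):
    """Combine all images of one case across models.

    ASSUMPTION: mean over images per model, then mean over models.
    logits_by_model: {model_name: [(benign_logit, malignant_logit), ...]}
    temperatures:    {model_name: fitted temperature}
    """
    per_model = []
    for name, image_logits in logits_by_model.items():
        probs = [malignant_probability(l, temperatures[name]) for l in image_logits]
        per_model.append(sum(probs) / len(probs))
    return sum(per_model) / len(per_model)


def triage(p, low=0.5, high=0.5):
    """Two-way call when low == high, three-way call when low < high.

    ASSUMPTION: the thresholds are hypothetical, not taken from the paper.
    """
    if p >= high:
        return "malignant"
    if p < low:
        return "benign"
    return "inconclusive"


def sensitivity_specificity(truths, calls):
    """Sensitivity and specificity over conclusive calls only.

    ASSUMPTION: inconclusive cases are left out of both denominators.
    """
    tp = fn = tn = fp = 0
    for truth, call in zip(truths, calls):
        if call == "inconclusive":
            continue
        if truth == "malignant":
            tp += call == "malignant"
            fn += call == "benign"
        else:
            tn += call == "benign"
            fp += call == "malignant"
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy logits for one case with two images; all values are made up.
    case = {
        "VGG16": [(0.2, 1.9), (0.5, 1.1)],
        "ResNet50": [(0.1, 2.4), (0.9, 0.7)],
        "MobileNet": [(0.3, 1.5), (0.4, 1.2)],
    }
    temps = {"VGG16": 1.5, "ResNet50": 2.0, "MobileNet": 1.2}
    p = case_probability(case, temps)
    print(round(p, 3), triage(p), triage(p, low=0.2, high=0.8))
```

In practice the temperature for each model would be fitted on the validation set, and the thresholds would be chosen there as well, for example to match a target sensitivity. Widening the gap between `low` and `high` sends more cases to the inconclusive group, which is the trade-off the Ovry-Dx2 results describe.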