Breast Lesion Diagnosis Using Static Images and Dynamic Video

Yunwen Huang, Hongyu Hu, Ying Zhu, Yi Xu
2023-08-19
Abstract: Deep learning-based computer-aided diagnosis (CAD) systems have been developed for breast ultrasound. Most of them focus on a single ultrasound imaging modality, using either representative static images or the dynamic video of a real-time scan. In fact, these two imaging modalities are complementary for lesion diagnosis: dynamic videos provide detailed three-dimensional information about the lesion, while static images capture its typical sections. In this work, we propose a multi-modality breast tumor diagnosis model that imitates the diagnostic process of radiologists. It learns the features of both static images and dynamic video and explores the potential relationship between the two modalities. Considering that static images are carefully selected by professional radiologists, we propose to aggregate dynamic video features under the guidance of domain knowledge from static images before fusing the multi-modality features. Our work is validated on a breast ultrasound dataset composed of 897 sets of ultrasound images and videos. Experimental results show that our model boosts the performance of benign/malignant classification, achieving an AUC of 90.0% and an accuracy of 81.7%.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to combine two modalities, static images and dynamic video, in breast ultrasound diagnosis to improve the benign/malignant classification of breast lesions. Most current deep learning-based computer-aided diagnosis (CAD) systems focus on a single ultrasound imaging modality, using either representative static images or the dynamic video of a real-time scan. However, the two modalities are complementary for lesion diagnosis: dynamic videos provide detailed three-dimensional information about the lesion, while static images capture its typical sections. The study therefore proposes a multimodal breast tumor diagnosis model that mimics the diagnostic process of radiologists by learning the features of static images and dynamic video and exploring the potential relationships between them.

Specifically, the main contributions of the paper are:

1. **Proposing a multimodal breast ultrasound tumor diagnosis framework**, which processes multimodal features from multiple static images and a single dynamic video, matching the actual diagnostic process.
2. **Designing an image-guided video feature enhancement and aggregation module**, which embeds the domain knowledge in static images into the dynamic video to guide the extraction and aggregation of video features.
3. **Conducting extensive experiments on a real-world multimodal breast cancer dataset**, verifying the effectiveness of the proposed method. The results show that it outperforms the baseline model on the benign/malignant classification task, achieving an AUC of 90.0% and an accuracy of 81.7%.

Through these contributions, the study aims to improve the accuracy and efficiency of breast ultrasound diagnosis and reduce the workload of radiologists.
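The idea of aggregating video features under the guidance of static images can be illustrated with a small sketch. The summary does not specify the paper's architecture, so everything here is an assumption made for illustration: the function name `image_guided_aggregate`, the use of scaled dot-product attention with image features as queries, the mean pooling, and the concatenation used for fusion are all hypothetical stand-ins, and the random arrays stand in for features that a real system would obtain from trained encoders.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def image_guided_aggregate(image_feats, frame_feats):
    """Hypothetical image-guided aggregation of video frame features.

    image_feats: (n_images, d) features of the static images.
    frame_feats: (n_frames, d) features of the video frames.
    Returns the fused vector (2 * d,) and the attention weights.
    """
    d = image_feats.shape[1]
    # Assumption: each static image acts as a query over the video frames.
    scores = image_feats @ frame_feats.T / np.sqrt(d)  # (n_images, n_frames)
    weights = softmax(scores, axis=1)  # each row sums to 1
    # Weighted sum of frame features, one vector per static image.
    guided = weights @ frame_feats  # (n_images, d)
    # Assumption: mean pooling, then fusion by simple concatenation.
    video_vec = guided.mean(axis=0)
    image_vec = image_feats.mean(axis=0)
    fused = np.concatenate([image_vec, video_vec])
    return fused, weights


# Random placeholders for encoder outputs: 3 static images, 40 video frames.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(3, 16))
frame_feats = rng.normal(size=(40, 16))
fused, weights = image_guided_aggregate(image_feats, frame_feats)
print(fused.shape, weights.shape)
```

In this sketch, frames that resemble a radiologist-selected image receive higher weight, which is one plausible reading of "guidance" from static images. The fused vector would then be passed to a classifier head for the benign/malignant decision.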