Vision Mamba for Classification of Breast Ultrasound Images

Ali Nasiri-Sarvi, Mahdi S. Hosseini, Hassan Rivaz
2024-09-17
Abstract: Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks. This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the breast ultrasound BUSI dataset and Breast Ultrasound B dataset. Our evaluation, which includes multiple runs of experiments and statistical significance analysis, demonstrates that some of the Mamba-based architectures often outperform CNN and ViT models with statistically significant results. For example, on the B dataset, the best Mamba-based models have a 1.98% average AUC and a 5.0% average accuracy improvement compared to the best non-Mamba-based model in this study. These Mamba-based models effectively capture long-range dependencies while maintaining some inductive biases, making them suitable for applications with limited data. The code is available at https://github.com/anasiri/BU-Mamba
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The main objective of this paper is to compare vision models based on the Mamba architecture (VMamba and Vim) with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) on breast ultrasound image classification. The researchers used two datasets, BUSI and the Breast Ultrasound B dataset, and repeated each experiment over multiple runs so that the comparison does not rest on a single result. Statistical significance analysis (such as t-tests) showed that the Mamba-based models significantly outperformed the other models in certain cases and were not inferior to them in any case. These models capture long-range dependencies while retaining some inductive bias, which makes them well suited to applications with limited data. Overall, the study aims to verify whether Mamba-based models can be effective tools for breast ultrasound image classification.
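The multi-run significance comparison described above can be illustrated with a small sketch. It assumes Welch's unequal-variance t-test; the summary only says "t-tests", so the exact variant used in the paper is not known here. The function name `welch_t` and the AUC values are hypothetical placeholders, not results from the paper. The sketch returns only the test statistic and degrees of freedom; a p-value additionally needs the t-distribution CDF, which a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` provides.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance is the sample variance (n - 1 denominator).
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical per-run AUC scores, NOT values reported in the paper.
mamba_auc = [0.95, 0.96, 0.94, 0.97, 0.95]
baseline_auc = [0.93, 0.94, 0.92, 0.93, 0.94]

t_stat, dof = welch_t(mamba_auc, baseline_auc)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

A positive statistic here means the first group has the higher mean across runs; whether the difference is significant depends on the p-value at the chosen threshold.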