MFS Enhanced SAM: Achieving Superior Performance in Bimodal Few-Shot Segmentation

Ying Zhao, Kechen Song, Wenqi Cui, Hang Ren, Yunhui Yan
DOI: https://doi.org/10.1016/j.jvcir.2023.103946
IF: 2.887
2023-01-01
Journal of Visual Communication and Image Representation
Abstract: Recently, the Segment Anything Model (SAM) has become popular in the computer vision field because of its powerful image segmentation ability and its highly interactive support for various prompts, opening a new era of large vision foundation models. But is SAM really omnipotent? In this letter, we establish a comprehensive bimodal few-shot segmentation indoor dataset, VT-840-5i, and compare SAM with eight state-of-the-art few-shot segmentation (FSS) methods on two benchmark datasets. Qualitative and quantitative experimental results show that although SAM is very effective in general object segmentation, it still has room for improvement in some challenging scenarios. Therefore, we introduce thermal infrared auxiliary information into the segmentation task and provide multiple fusion strategies (MFS) so that readers can choose the most suitable approach for a specific task. Finally, we discuss several potential future research trends for SAM. Our test results are available at: https://github.com/VDT-2048/Bi-SAM.