Segment Anything with Precise Interaction

Mengzhen Liu, Mengyu Wang, Henghui Ding, Yilong Xu, Yao Zhao, Yunchao Wei
DOI: https://doi.org/10.1145/3664647.3681470
2024-01-01
Abstract: Although the Segment Anything Model (SAM) has achieved impressive results in many segmentation tasks and benchmarks, its performance noticeably deteriorates when applied to high-resolution images for high-precision segmentation, limiting its use in many real-world applications. In this work, we explore transferring SAM to the domain of high-resolution images and propose Pi-SAM. Compared to the original SAM and its variants, Pi-SAM offers the following advantages. First, Pi-SAM has a strong perception capability for the extremely fine details in high-resolution images, enabling it to generate high-precision segmentation masks. As a result, Pi-SAM significantly surpasses previous methods on four high-resolution datasets. Second, Pi-SAM supports more precise user interactions. In addition to the native promptable ability of SAM, Pi-SAM allows users to interactively refine the segmentation predictions simply by clicking, whereas the original SAM fails to achieve this on high-resolution images. Third, building upon SAM, Pi-SAM introduces very few additional parameters and little computational cost, and supports highly efficient fine-tuning to achieve the above performance.