Prompting to Adapt Foundational Segmentation Models
Jie Hu, Jie Li, Yue Ma, Liujuan Cao, Songan Zhang, Wei Zhang, Guannan Jiang, Rongrong Ji
DOI: https://doi.org/10.1145/3664647.3680884
2024-01-01
Abstract: Foundational segmentation models, predominantly trained on scenes typical of natural environments, struggle to generalize across varied image domains. Traditional "training-to-adapt" methods rely heavily on extensive data retraining and modifications to the model architecture. This significantly limits the models' generalization capabilities and efficiency in deployment. In this study, we propose a novel adaptation paradigm, termed "prompting-to-adapt", to tackle the above issue by introducing an innovative image prompter. This prompter generates domain-specific prompts from few-shot image-mask pairs, incorporating diverse image processing techniques to enhance adaptability. To tackle the inherent non-differentiability of image prompts, we further devise an information-estimation-based gradient descent strategy that leverages the information entropy of image processing combinations to optimize the prompter, ensuring effective adaptation. Through extensive experiments across nine datasets spanning seven image domains (i.e., depth, thermal, camouflage, endoscopic, ultrasound, grayscale, and natural) and four scenarios (i.e., common scenes, camouflage objects, medical images, and industrial data), we demonstrate that our approach significantly improves the foundational models' adaptation capabilities. Moreover, the interpretability of the generated prompts provides insight into their image processing mechanisms. Source code is available at https://github.com/yuema1303/Prompting-to-Adapt-FSM.
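
To make the idea of optimizing a non-differentiable image prompt more concrete, the following is a minimal toy sketch and not the authors' implementation. The abstract does not specify the information-estimation-based strategy, so this sketch swaps in a standard score-function (REINFORCE-style) gradient estimator with an entropy bonus over a categorical distribution of image operations. The operation set `OPS`, the threshold-based `frozen_segmenter`, the toy data from `make_toy_pair`, and all hyperparameters are assumptions introduced purely for illustration.

```python
import numpy as np

# Hypothetical candidate operations; the paper's actual operation set
# is not listed in the abstract.
OPS = {
    "identity": lambda x: x,
    "invert": lambda x: 1.0 - x,
    "gamma_2": lambda x: x ** 2.0,
    "gamma_half": lambda x: np.sqrt(x),
}
OP_NAMES = list(OPS)


def frozen_segmenter(image):
    """Stand-in for a frozen foundation model: bright pixels are foreground."""
    return image > 0.5


def iou(pred, mask):
    """Intersection over union of two boolean masks."""
    inter = np.logical_and(pred, mask).sum()
    union = np.logical_or(pred, mask).sum()
    return inter / union if union > 0 else 1.0


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def make_toy_pair(size=8):
    """Toy few-shot pair: a dark object on a bright background."""
    image = np.full((size, size), 0.9)
    mask = np.zeros((size, size), dtype=bool)
    mask[2:6, 2:6] = True
    image[mask] = 0.1
    return image, mask


def fit_prompter(pairs, steps=300, lr=0.5, entropy_weight=0.01, seed=0):
    """Learn a distribution over operations without differentiating through them."""
    rng = np.random.default_rng(seed)
    logits = np.zeros(len(OP_NAMES))
    baseline = 0.0  # running mean reward, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        k = rng.choice(len(OP_NAMES), p=probs)
        image, mask = pairs[rng.integers(len(pairs))]
        reward = iou(frozen_segmenter(OPS[OP_NAMES[k]](image)), mask)
        # Score-function gradient of the expected reward w.r.t. the logits.
        grad = (reward - baseline) * (np.eye(len(OP_NAMES))[k] - probs)
        # Gradient of the entropy H = -sum(p * log p) w.r.t. the logits.
        log_p = np.log(probs + 1e-12)
        entropy = -(probs * log_p).sum()
        grad += entropy_weight * (-probs * (log_p + entropy))
        logits += lr * grad
        baseline = 0.9 * baseline + 0.1 * reward
    return softmax(logits)


probs = fit_prompter([make_toy_pair()])
best_op = OP_NAMES[int(probs.argmax())]
```

In this toy setting only the `invert` operation turns the dark object into something the stand-in segmenter treats as foreground, so the learned distribution concentrates on it. The learned probabilities are directly readable, which loosely mirrors the interpretability the abstract attributes to the generated prompts.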