MetaMask: Improving Few-Shot Semantic Segmentation Via Multi-Mask Calibration

Li Dinghang, Zongqing Lu, Weiliang Zheng, Qingmin Liao, Fan Lyu
DOI: https://doi.org/10.1109/ijcnn60899.2024.10651371
2024-01-01
Abstract: Few-shot Semantic Segmentation (FSS) aims to develop models that can segment previously unseen classes from only a few annotations. Recent approaches employ a "multi-mask" framework, which first generates various mask proposals from query images and then, guided by support images, matches the relevant proposals to produce the final output. Despite its promise, this framework is limited by the quality of mask proposals for unseen classes and by a naive mask-matching process. To address these limitations, we propose a meta-learning-based method called MetaMask. First, MetaMask builds a Support-Guided Latent Object Segmenter (SG-LOS) module, which incorporates unseen-class information into mask proposal generation for query images and uses episodic training to enhance mask generation for latent unseen classes. Second, MetaMask improves the mask-matching mechanism through our proposed Contrastive Mask Matching (CMM) module, whose cross-image multi-level contrastive learning strategy strengthens the feature embedding spaces. Our method achieves 69.9% mIoU in the Pascal-5^i one-shot setting and 49.6% mIoU in the COCO-20^i one-shot setting, outperforming our baseline by 6.6% and 5.4%, respectively, and setting a new state of the art on both datasets.
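To make the "multi-mask" framework described in the abstract concrete, the following is a minimal sketch of the naive matching step that the paper sets out to improve. It is not the MetaMask SG-LOS or CMM module. The function names (`masked_average`, `match_proposals`, `iou`), the cosine-similarity scoring, and the `threshold` value are illustrative assumptions; the feature maps are assumed to come from some backbone that is not shown here.

```python
import numpy as np


def masked_average(features, mask):
    """Average the C-dim feature vectors over pixels where mask is True.

    features: (C, H, W) float array; mask: (H, W) boolean array.
    """
    weights = mask.astype(features.dtype)
    total = weights.sum()
    if total == 0:
        # Empty mask: return a zero vector so its similarity score is 0.
        return np.zeros(features.shape[0], dtype=features.dtype)
    return (features * weights).sum(axis=(1, 2)) / total


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two 1-D vectors (eps avoids division by zero)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def match_proposals(support_feat, support_mask, query_feat, proposals, threshold=0.5):
    """Hypothetical naive matching: keep query proposals similar to the support prototype.

    proposals: list of (H, W) boolean masks generated for the query image.
    Returns the union of the selected proposals and the per-proposal scores.
    """
    prototype = masked_average(support_feat, support_mask)
    scores = [cosine(prototype, masked_average(query_feat, p)) for p in proposals]
    merged = np.zeros(query_feat.shape[1:], dtype=bool)
    for proposal, score in zip(proposals, scores):
        if score >= threshold:
            merged |= proposal
    return merged, scores


def iou(pred, target):
    """Intersection over union of two boolean masks (1.0 if both are empty)."""
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)
```

Averaging `iou` over the evaluated classes would give an mIoU figure of the kind quoted in the abstract. A learned matching module would replace the fixed cosine threshold used in this sketch.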