Interactive Segmentation by Considering First-Click Intentional Ambiguity

Kangpeng Hu, Quansen Sun, Yinghui Sun, Tao Wang
DOI: https://doi.org/10.1145/3664647.3681091
2024-01-01
Abstract: Interactive segmentation (IS) builds on general semantic segmentation by taking user preferences into account in order to obtain the specific target of interest. Most related algorithms generate only a single mask, so their robustness may be limited by the diversity of user intention in the early interaction stage, namely the ambiguous choice among an object part, the whole object, and an adherent object, especially when there is only one click. To handle this, we propose a novel framework called Diversified Interactive Segmentation Network (DISNet), in which we revisit the peculiarity of the first click: given an input image, DISNet outputs multiple candidate masks under the guidance of the first click only; a Dual-attentional Mask Correction (DAMC) module is then used to model the complex mutual effects among the first click, all clicks, and image features. Moreover, we design a new sampling strategy to generate ground-truth (GT) masks with rich semantic relations. Performance analysis and ablation studies demonstrate the efficacy of our methods and further illustrate the decisive role of the first click in IS.
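The first-click ambiguity described in the abstract can be illustrated with a toy sketch. This is not the authors' DISNet or DAMC module: it assumes the candidate masks already exist and swaps the learned correction step for a simple click-consistency score. The function names `click_consistency` and `select_candidate`, the mask shapes, and the click format `(row, col, is_positive)` are all hypothetical choices made for illustration.

```python
import numpy as np

def click_consistency(mask, clicks):
    """Fraction of clicks the mask agrees with (illustrative scoring, not DAMC).

    mask:   2D boolean array, True where the pixel belongs to the target.
    clicks: list of (row, col, is_positive) tuples.
    """
    if not clicks:
        return 0.0
    agree = sum(bool(mask[r, c]) == is_pos for r, c, is_pos in clicks)
    return agree / len(clicks)

def select_candidate(candidates, clicks):
    """Return the index of the best-scoring candidate and all scores."""
    scores = [click_consistency(m, clicks) for m in candidates]
    return int(np.argmax(scores)), scores

# Three hypothetical candidates for the same first click:
# an object part, the whole object, and the object plus an adherent neighbor.
h = w = 8
part = np.zeros((h, w), dtype=bool)
part[2:4, 2:4] = True
whole = np.zeros((h, w), dtype=bool)
whole[1:6, 1:6] = True
adherent = np.zeros((h, w), dtype=bool)
adherent[1:6, 1:8] = True
candidates = [part, whole, adherent]

# A single positive click lies inside all three masks, so it cannot
# tell them apart: every candidate gets the same score.
first_click = [(3, 3, True)]
idx_first, scores_first = select_candidate(candidates, first_click)

# Two further clicks (one positive, one negative) resolve the ambiguity.
clicks = first_click + [(5, 5, True), (3, 7, False)]
idx_all, scores_all = select_candidate(candidates, clicks)
```

In this toy setup, all three candidates tie after the first click, and only the whole-object mask agrees with every click once the later ones are added. That is the situation the abstract points to: a single output mask has to commit to one reading of the first click, whereas several candidates keep the options open.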