Abstract: The co-salient object detection (CoSOD) task is traditionally defined as segmenting the common salient objects in a group of relevant images. This definition rests on an assumption of group-consensus consistency that does not always hold in the open-world setting, so models suffer from robustness issues when the input image group contains irrelevant images. To tackle this problem, we introduce a group selective exchange-masking (GSEM) approach for enhancing the robustness of CoSOD models. GSEM takes two groups of images as input, each containing a different type of salient object. Based on a mixed metric we designed, GSEM selects a subset of images from each group using a novel learning-based strategy, and the selected images are then exchanged between the groups. To account simultaneously for the uncertainty introduced by irrelevant images and the consensus features of the remaining relevant images in the group, we designed a latent variable generator branch and a CoSOD transformer branch. The former is built on a vector quantised variational autoencoder (VQ-VAE) and generates stochastic global variables that model uncertainty. The latter captures correlation-based local features that encode group consensus. Finally, the outputs of the two branches are merged and passed to a transformer-based decoder to generate robust predictions. Because no benchmark datasets are currently designed specifically for open-world scenarios, we constructed three open-world benchmark datasets, namely OWCoSal, OWCoSOD, and OWCoCA, based on existing datasets. By breaking the group-consistency assumption, these datasets simulate real-world scenarios effectively and can better evaluate the robustness and practicality of models.
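The exchange step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper selects images with a learned strategy and a mixed metric, whereas here the scoring functions `score_a` and `score_b` are hypothetical callables supplied by the caller, images are stood in for by strings, and the boolean "noise" flags are an assumption about how exchanged images might be marked for training.

```python
from typing import Callable, List, Tuple

Image = str  # hypothetical stand-in for an image tensor


def select_top_k(group: List[Image], score: Callable[[Image], float], k: int) -> List[int]:
    """Return the (sorted) indices of the k highest-scoring images in a group."""
    ranked = sorted(range(len(group)), key=lambda i: score(group[i]), reverse=True)
    return sorted(ranked[:k])


def exchange_groups(
    group_a: List[Image],
    group_b: List[Image],
    score_a: Callable[[Image], float],
    score_b: Callable[[Image], float],
    k: int,
) -> Tuple[Tuple[List[Image], List[bool]], Tuple[List[Image], List[bool]]]:
    """Swap k selected images between two groups.

    Returns each new group together with a flag list marking which
    positions now hold an image from the other group (assumed here to
    be the "noise" images that contain no co-salient object).
    """
    idx_a = select_top_k(group_a, score_a, k)
    idx_b = select_top_k(group_b, score_b, k)

    new_a, new_b = list(group_a), list(group_b)
    noise_a = [False] * len(group_a)
    noise_b = [False] * len(group_b)

    # Pair up the selected positions and swap their contents.
    for i, j in zip(idx_a, idx_b):
        new_a[i], new_b[j] = group_b[j], group_a[i]
        noise_a[i] = True
        noise_b[j] = True

    return (new_a, noise_a), (new_b, noise_b)
```

With two three-image groups and `k = 1`, each group ends up holding one image from the other group, flagged as noise, so a model trained on the result sees groups that violate the group-consistency assumption.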

What problem does this paper attempt to address?

### The Problem Addressed by the Paper

The paper addresses the robustness of co-salient object detection (CoSOD) in open-world environments. Traditionally, CoSOD is defined as segmenting the common salient objects from a set of related images, which assumes that every image contains the same salient object. In open-world environments this assumption does not always hold: the input image set may contain unrelated images, and the model must still behave sensibly on them. Specifically, when the test image set contains images without co-salient objects, traditional CoSOD models often generate false-positive predictions, which limits their usefulness in practical applications.

To overcome this issue, the authors propose a generative uncertainty-aware group selective exchange-masking (GSEM) method to enhance the robustness of CoSOD models. GSEM simulates open-world conditions by introducing "noise images" into each group, and it uses a designed metric to select the most challenging images for exchange. The authors also design a parallel feature extraction mechanism: a latent variable generator branch (LVGB) that generates uncertainty features, and a CoSOD Transformer branch (CoSOD-TB) that captures intra-group consistency.

In summary, the main contributions of the paper are:

1. Designing a CoSOD model learning framework suitable for open-world scenarios using the GSEM strategy.
2. Proposing a parallel feature extraction mechanism that combines LVGB and CoSOD-TB to capture uncertainty and intra-group consistency information, respectively.
3. Reconstructing three benchmark datasets (OWCoSal, OWCoSOD, OWCoCA) to better evaluate the model's robustness and practicality in open-world scenarios.
4. Validating the proposed method through extensive experiments, in particular demonstrating superior performance over existing methods on the open-world datasets.
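The abstract states that the latent variable generator branch is built on a vector quantised variational autoencoder. The core operation of such a model is replacing an encoder output with its nearest codebook entry. The sketch below shows only that generic lookup in plain Python; the function name `quantize`, the toy codebook, and the two-dimensional vectors are illustrative assumptions, and it does not reproduce how the paper turns the quantised codes into stochastic global variables.

```python
from typing import List, Sequence, Tuple


def quantize(z: Sequence[float], codebook: List[Sequence[float]]) -> Tuple[int, Sequence[float]]:
    """Map a latent vector z to its nearest codebook entry (squared Euclidean distance)."""

    def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(u, v))

    # Index of the closest codebook vector; ties resolve to the lowest index.
    index = min(range(len(codebook)), key=lambda k: sq_dist(z, codebook[k]))
    return index, codebook[index]


# Toy example with a hypothetical three-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
index, code = quantize([0.9, 1.2], codebook)
```

In a trained model the codebook is learned jointly with the encoder and decoder; here it is fixed by hand purely to show the lookup.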

Towards Open-World Co-Salient Object Detection with Generative Uncertainty-aware Group Selective Exchange-Masking
