Abstract: Weakly Supervised Semantic Segmentation (WSSS) with only image-level labels reduces the annotation burden and has developed rapidly in recent years. However, current mainstream methods use only a single image's information to localize the target and do not account for relationships across images. Remote Sensing (RS) images contain complex backgrounds and multiple categories, which makes it challenging to locate targets and differentiate between their categories. In contrast to previous methods that mostly focus on single-image information, we propose CISM, a novel cross-image semantic mining WSSS framework. CISM explores cross-image semantics in multi-category RS scenes for the first time with two novel loss functions: the Common Semantic Mining (CSM) loss and the Non-common Semantic Contrastive (NSC) loss. In particular, prototype vectors and the Prototype Interactive Enhancement (PIE) module are employed to capture semantic similarities and differences across images. To overcome category confusion and interference from closely related backgrounds, we integrate the Single-Label Secondary Classification (SLSC) task and the corresponding single-label loss into our framework. Furthermore, a Multi-Category Sample Generation (MCSG) strategy is devised to balance the distribution of samples among categories and substantially increase the diversity of images. Together, these designs facilitate the generation of more accurate and finer-grained Class Activation Maps (CAMs) for each target category. Extensive experiments demonstrate the superiority of our approach on RS data: CISM is the first WSSS framework to explore cross-image semantics in multi-category RS scenes, and it obtains state-of-the-art results on the iSAID dataset using only image-level labels. Experiments on the PASCAL VOC2012 dataset also demonstrate the effectiveness and competitiveness of the algorithm, which reaches a mean Intersection-over-Union (mIoU) of 67.3% and 68.5% on the validation and test sets of PASCAL VOC2012, respectively.
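
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how it is commonly computed. It is not the authors' evaluation code. The function name `mean_iou`, the nested-list label maps, and the `ignore_index` value of 255 are illustrative assumptions. Benchmark scores are normally accumulated over the whole dataset rather than computed per image as done here.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
    ignore_index: int = 255,  # assumed "void" label; skipped during scoring
) -> float:
    """Mean Intersection-over-Union over classes present in pred or target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:
                continue
            if p == t:
                # Pixel belongs to both the predicted and true region of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel extends the union of the predicted class and the true class.
                union[p] += 1
                union[t] += 1
    # Average only over classes that occur at least once.
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x2 label maps with two classes (hypothetical data).
pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so `score` is their average, 7/12.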

Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information

Looking Beyond Single Images for Weakly Supervised Semantic Segmentation Learning.

Coupling Global Context and Local Contents for Weakly-Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation Via Alternate Self-Dual Teaching.

Weakly Supervised Semantic Segmentation by Pixel-to-Prototype Contrast

Learning Effectively from Noisy Supervision for Weakly Supervised Semantic Segmentation.

Weakly Supervised Fine-Grained Semantic Segmentation Via Spatial Correlation-Guided Learning

Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach

Weakly-Supervised Semantic Segmentation with Visual Words Learning and Hybrid Pooling

Weak-to-Strong Consistency Learning for Semisupervised Image Segmentation.

Weakly-Supervised Dual Clustering for Image Semantic Segmentation

Improving Semi-Supervised Semantic Segmentation with Dual-Level Siamese Structure Network

Group-Wise Learning for Weakly Supervised Semantic Segmentation

Weakly Supervised Semantic Segmentation via Alternative Self-Dual Teaching

Weakly Supervised Semantic Segmentation in Aerial Imagery via Cross-Image Semantic Mining

L2A: Learning Affinity from Attention for Weakly Supervised Continual Semantic Segmentation

Weakly-Supervised Semantic Segmentation with Image-Level Labels: from Traditional Models to Foundation Models

MuSCLe: A Multi-Strategy Contrastive Learning Framework for Weakly Supervised Semantic Segmentation

Dual Branch Framework Using Positive and Negative Learning for Weakly Supervised Semantic Segmentation

Separate and Conquer: Decoupling Co-occurrence via Decomposition and Representation for Weakly Supervised Semantic Segmentation

SemPLeS: Semantic Prompt Learning for Weakly-Supervised Semantic Segmentation