Improving Multi-label Recognition using Class Co-Occurrence Probabilities

Samyak Rawlekar, Shubhang Bhatnagar, Vishnuvardhan Pogunulu Srinivasulu, Narendra Ahuja

2024-09-20

Abstract: Multi-label Recognition (MLR) involves the identification of multiple objects within an image. To address the additional complexity of this problem, recent works have leveraged information from vision-language models (VLMs) trained on large text-image datasets for the task. These methods learn an independent classifier for each object (class), overlooking correlations in their occurrences. Such co-occurrences can be captured from the training data as conditional probabilities between a pair of classes. We propose a framework to extend the independent classifiers by incorporating the co-occurrence information for object pairs to improve the performance of independent classifiers. We use a Graph Convolutional Network (GCN) to enforce the conditional probabilities between classes, by refining the initial estimates derived from image and text sources obtained using VLMs. We validate our method on four MLR datasets, where our approach outperforms all state-of-the-art methods.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning, Multimedia, Image and Video Processing

What problem does this paper attempt to address?

This paper addresses the difficulty of multi-label recognition (MLR), especially when training data is limited. MLR requires identifying multiple objects in a single image, which is harder than single-label classification for two reasons. First, the number of possible class combinations grows exponentially with the number of classes, so the amount of data needed to learn those combinations grows with it. Second, the spatial layout of objects varies from image to image, which makes recognition harder still. Existing methods typically rely on vision-language models (VLMs) pretrained on large-scale text-image datasets. However, these methods learn an independent classifier for each object and ignore the co-occurrence relationships between objects. Such relationships can be estimated from the training data as conditional probabilities between pairs of classes, but existing methods do not exploit them. The paper therefore proposes a framework that improves the independent classifiers by adding pairwise co-occurrence information. The method has two stages: VLMs first produce initial classification estimates, and a graph convolutional network (GCN) then refines those estimates using the conditional probabilities of class pairs. According to the paper, this improves classification accuracy and also alleviates over-fitting on small datasets. In short, the main goal is to improve multi-label recognition on small datasets by incorporating the co-occurrence information of class pairs.
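The two ingredients described above, pairwise conditional probabilities estimated from training labels and a graph-convolution step that refines initial per-class estimates, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the tanh nonlinearity, the residual connection, and the row normalization are all assumptions chosen for illustration, and the initial estimates are random stand-ins for VLM outputs.

```python
import numpy as np

def cooccurrence_probabilities(labels):
    """Estimate P[i, j] = P(class j present | class i present).

    labels: (num_images, num_classes) binary matrix of training labels.
    """
    labels = np.asarray(labels, dtype=float)
    joint = labels.T @ labels          # joint[i, j] = images containing both i and j
    counts = np.diag(joint).copy()     # counts[i] = images containing class i
    safe = np.where(counts > 0, counts, 1.0)  # avoid dividing by zero for unseen classes
    return joint / safe[:, None]

def gcn_refine(initial, cond_prob, weight):
    """One graph-convolution-style refinement step (illustrative assumption).

    initial:   (num_classes, d) initial per-class estimates, e.g. from a VLM.
    cond_prob: (num_classes, num_classes) output of cooccurrence_probabilities.
    weight:    (d, d) matrix that would be learned during training.
    """
    adj = cond_prob.T                  # adj[i, j] = P(i | j): evidence flows from j to i
    row_sums = adj.sum(axis=1, keepdims=True)
    adj = adj / np.where(row_sums > 0, row_sums, 1.0)  # row-normalize the edge weights
    # Residual update: keep the initial estimate and add the propagated correction.
    return initial + np.tanh(adj @ initial @ weight)

# Toy usage: 4 images, 3 classes.
toy_labels = [[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [1, 1, 0]]
P = cooccurrence_probabilities(toy_labels)
rng = np.random.default_rng(0)
initial = rng.normal(size=(3, 2))      # stand-in for VLM-derived estimates
refined = gcn_refine(initial, P, weight=0.1 * np.eye(2))
```

In the toy labels, class 0 appears in three images and co-occurs with class 1 in two of them, so `P[0, 1]` is 2/3, while class 2 always appears with class 1, so `P[2, 1]` is 1. The matrix is asymmetric, which is why the sketch is explicit about which direction the edge weights use.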

Dual Enhancement for Multi-Label Learning with Missing Labels

Multi-Label Classification with Label Graph Superimposing

Improved Multi-label Classification with Frequent Label-set Mining and Association

Multi-label classification by exploiting label correlations

Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning

Rethinking Modal-oriented Label Correlations for Multi-modal Multi-label Learning

Semantic and Correlation Disentangled Graph Convolutions for Multilabel Image Recognition

A multi-label image classification method combining multi-stage image semantic information and label relevance

Multi-Label Image Recognition With Graph Convolutional Networks

Learning Graph Convolutional Networks for Multi-Label Recognition and Applications

Joint multi-label multi-instance learning for image classification

DualCoOp: Fast Adaptation to Multi-Label Recognition with Limited Annotations

Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

MLGN: A Multi-Label Guided Network for Improving Text Classification

GKGNet: Group K-Nearest Neighbor based Graph Convolutional Network for Multi-Label Image Recognition

Open-Vocabulary Multi-Label Classification via Multi-Modal Knowledge Transfer

Vision-Language Pseudo-Labels for Single-Positive Multi-Label Learning

Multi-Label Learning by Exploiting Label Correlations Locally

Robust Multi-Graph Multi-Label Learning With Dual-Granularity Labeling

PVLR: Prompt-driven Visual-Linguistic Representation Learning for Multi-Label Image Recognition