Abstract: In this paper, we propose a weakly supervised domain generalization (WSDG) method for real-world visual recognition tasks, in which classifiers are trained on Web data (e.g., Web images and Web videos) with noisy labels. Two challenging problems must be solved to learn robust classifiers: the first is to cope with the label noise in the training Web data from the source domain, and the second is to enhance the generalization capability of the learned classifiers to an arbitrary target domain. To handle the first problem, we partition the training samples within each category into clusters, treating each cluster as a bag and the samples in it as instances. We then identify a proportion of good training samples in each bag and train robust classifiers on them, which leads to a multi-instance learning (MIL) problem. To handle the second problem, we assume that the training samples may form a set of hidden (latent) domains, each associated with a distinctive data distribution. For each category and each latent domain, we then learn one classifier by extending our MIL formulation, which leads to our WSDG approach. In the testing stage, our approach obtains better generalization capability by effectively integrating the multiple classifiers from different latent domains in each category. Moreover, we further extend our WSDG approach to utilize the additional textual descriptions associated with Web data as privileged information (PI), even though the testing data do not have such PI. Extensive experiments on three benchmark data sets indicate that the proposed methods are effective for real-world visual recognition tasks when learning from Web data.
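
The two ideas in the abstract, keeping a proportion of good instances per bag and combining the per-latent-domain classifiers of each category at test time, can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not say how good instances are scored or how the classifiers are integrated, so ranking instances by a current classifier score and averaging linear decision values are both assumptions made here for illustration. All function names (`select_good_instances`, `fuse_latent_domain_scores`, `predict_category`) are hypothetical.

```python
import math


def select_good_instances(bag_scores, proportion):
    """Return the indices of the top-scoring instances in one bag.

    Assumption: "good" instances are the ones a current classifier scores
    highest; the paper instead learns the selection jointly with the classifier.
    """
    n_keep = max(1, math.ceil(proportion * len(bag_scores)))
    ranked = sorted(range(len(bag_scores)),
                    key=lambda i: bag_scores[i], reverse=True)
    return sorted(ranked[:n_keep])


def linear_score(weights, bias, x):
    """Decision value of one linear classifier on feature vector x."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def fuse_latent_domain_scores(classifiers, x):
    """Average the decision values of one category's latent-domain classifiers.

    Assumption: plain averaging stands in for the paper's integration step.
    """
    scores = [linear_score(w, b, x) for (w, b) in classifiers]
    return sum(scores) / len(scores)


def predict_category(per_category_classifiers, x):
    """Pick the category whose fused latent-domain score is highest."""
    fused = {c: fuse_latent_domain_scores(clfs, x)
             for c, clfs in per_category_classifiers.items()}
    return max(fused, key=fused.get)


# Toy data: one bag with four instance scores, and two categories with
# two latent-domain classifiers each, given as (weights, bias) pairs.
kept = select_good_instances([0.9, -0.2, 0.4, 0.1], proportion=0.5)
toy_classifiers = {
    "cat": [([1.0, 0.0], 0.0), ([0.5, 0.5], 0.0)],
    "dog": [([0.0, 1.0], 0.0), ([-0.5, 1.0], 0.0)],
}
label = predict_category(toy_classifiers, [2.0, 1.0])
```

In this toy setup, a bag of four instances with a proportion of 0.5 keeps its two highest-scoring instances, and a test sample is assigned to the category whose averaged latent-domain score is largest.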

Exploiting Privileged Information From Web Data For Image Categorization

A PCA Based Automatic Image Categorization Approach Using Dominant Color Features

Exploiting Textual and Visual Features for Image Categorization

Exploiting Web Images for Multi-Output Classification: from Category to Subcategories

Image Classification by Cross-Media Active Learning with Privileged Information

Exploiting Multi-Context Analysis in Semantic Image Classification

Distance Metric Learning from Uncertain Side Information with Application to Automated Photo Tagging

Webly Supervised Learning with Category-level Semantic Information

Simple and Efficient Learning using Privileged Information

Refining Image Categorization by Exploiting Web Images and General Corpus

Text-based image retrieval using progressive multi-instance learning

Image Classification Method Rationally Utilizing Spatial Information of the Image

Learning with Privileged Information for Multi-Label Classification

Domain-specific website recognition using hybrid vector space model

Visual Recognition by Learning from Web Data Via Weakly Supervised Domain Generalization

Exploiting Web Images for Fine-Grained Visual Recognition by Eliminating Open-Set Noise and Utilizing Hard Examples

Extracting Privileged Information for Enhancing Classifier Learning

Exploiting Probabilistic Topic Models to Improve Text Categorization under Class Imbalance

Webly-Supervised Fine-Grained Visual Categorization Via Deep Domain Adaptation

Fine-grained Classification using Heterogeneous Web Data and Auxiliary Categories

Visual saliency coding for image categorization