Abstract: Active learning has proven effective in reducing the labeling effort required for supervised learning. However, existing active learning work has mainly focused on training models for a single domain. In practical applications, it is common to train classifiers for multiple domains simultaneously. For example, merchant web sites such as Amazon.com may need a set of classifiers to predict the sentiment polarity of product reviews collected from various domains (e.g., electronics, books, shoes). Although different domains have their own unique features, they may share some common latent features. If active learning is applied to each domain separately, data instances selected from different domains may carry duplicate knowledge because of these common features. Choosing which data to label across multiple domains is therefore crucial to further reducing human labeling effort in multi-domain learning. In this paper, we propose a novel multi-domain active learning framework that jointly selects data instances from all domains while accounting for duplicate information. In our solution, a shared subspace is first learned to represent the common latent features of the different domains. By considering the common and domain-specific features together, the model loss reduction induced by each data instance can be decomposed into a common part and a domain-specific part. In this way, duplicate information across domains is encoded in the common part of the model loss reduction and taken into account when querying. We compare our method with state-of-the-art active learning approaches on three text classification tasks: sentiment classification, newsgroup classification, and email spam filtering. The experimental results show that our method reduces human labeling effort by 33.2%, 42.9%, and 68.7% on the three tasks, respectively.
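
The idea of splitting each instance's value into a common part and a domain-specific part can be illustrated with a small sketch. This is not the paper's algorithm: the function `joint_select`, the projection matrix `theta`, the precomputed gains, and the cosine-similarity discount are all hypothetical stand-ins, chosen only to show how duplicate information in a shared subspace might lower the value of a query.

```python
import numpy as np


def joint_select(candidates, theta, budget):
    """Greedily pick `budget` instances across all domains.

    candidates: list of (domain_id, feature_vector, specific_gain, common_gain).
                The two gains are assumed to be precomputed estimates of the
                domain-specific and common parts of the loss reduction.
    theta:      (h, d) matrix projecting features onto an assumed shared subspace.
    Returns the indices of the chosen candidates, in selection order.
    """
    # Project every candidate into the shared subspace and L2-normalise.
    z = np.array([theta @ np.asarray(x, dtype=float) for _, x, _, _ in candidates])
    z /= np.linalg.norm(z, axis=1, keepdims=True) + 1e-12

    chosen = []
    remaining = set(range(len(candidates)))
    for _ in range(min(budget, len(candidates))):
        best, best_score = None, -np.inf
        for i in sorted(remaining):
            _, _, g_spec, g_common = candidates[i]
            # Redundancy: highest cosine similarity (in the shared subspace)
            # to any instance already chosen, clipped to [0, 1].
            redundancy = max((float(z[i] @ z[j]) for j in chosen), default=0.0)
            redundancy = min(max(redundancy, 0.0), 1.0)
            # Only the common part is discounted; the domain-specific part
            # is unaffected by what was selected in other domains.
            score = g_spec + (1.0 - redundancy) * g_common
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data: candidates 0 and 1 come from different domains but coincide
# in the shared subspace; candidate 2 is distinct.
theta = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])
candidates = [
    ("books",       [1.0, 0.0, 0.2], 0.1, 1.0),
    ("electronics", [1.0, 0.0, 0.9], 0.2, 1.0),
    ("shoes",       [0.0, 1.0, 0.1], 0.1, 0.8),
]
picked = joint_select(candidates, theta, budget=2)
```

In this toy setup, scoring each domain independently would rank candidates 0 and 1 highest, even though they overlap in the shared subspace. The joint rule discounts the common gain of the second one once the first is chosen, so the distinct candidate is picked instead.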

Active Learning using Localized Generalization Error for Text Categorization

Aggressive Dimensionality Reduction With Reinforcement Local Feature Selection For Text Categorization

Collaborative Work with Linear Classifier and Extreme Learning Machine for Fast Text Categorization

Text Categorization Based on Regularization Extreme Learning Machine

Manifold Adaptive Experimental Design for Text Categorization

Active Learning Based on Transfer Learning Techniques for Text Classification

Localization-Aware Active Learning for Object Detection

A Grouped Structure-based Regularized Regression Model for Text Categorization

The Application of Active Query K-Means in Text Classification

Learning Distinctive Margin Toward Active Domain Adaptation

Learning Effective Features for Chinese Text Categorization

LSASGT: An Approach to Text Categorization Based on Latent Semantic Analysis and Spectral Graph Transducer

Fast text categorization based on collaborative work in the semantic and class spaces

A Generic Method for Fine-grained Category Discovery in Natural Language Texts

Exploiting Textual and Visual Features for Image Categorization

Language Independent Text Categorization

Active Learning Based on Local Representation

Multi-domain active learning for text classification

Active Generalized Category Discovery

Weak Learning Algorithm For Multi-Label Multiclass Text Categorization

An Editor Labeling Model for Training Set Expansion in Web Categorization