Integrating Informativeness, Representativeness and Diversity in Pool-Based Sequential Active Learning for Regression

Ziang Liu, Dongrui Wu
DOI: https://doi.org/10.1109/ijcnn48605.2020.9206845
2020-01-01
Abstract: In many real-world machine learning applications, unlabeled samples are easy to obtain, but labeling them is expensive and/or time-consuming. Active learning is a common approach for reducing this labeling effort: it selects the few most useful samples to label, so that a better machine learning model can be trained from the same number of labeled samples. This paper considers active learning for regression (ALR) problems. Three essential criteria (informativeness, representativeness, and diversity) have been proposed for ALR. However, very few approaches in the literature consider all three simultaneously. We propose three new ALR approaches, with different strategies for integrating the three criteria. Extensive experiments on 12 datasets from various domains demonstrate their effectiveness.
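To make the three criteria concrete, the following is a minimal illustrative sketch of one pool-based sequential selection loop. It is not one of the three approaches proposed in the paper, whose details the abstract does not give. Everything in it is an assumption chosen for illustration: the function names (`fit_linear`, `predict_linear`, `select_next`), the bootstrap committee of ridge regressors used for informativeness, nearest-labeled distance for diversity, inverse mean distance to the unlabeled pool for representativeness, and the multiplicative way the three scores are combined.

```python
import numpy as np

def fit_linear(X, y, ridge=1e-3):
    """Ridge-regularized least squares with a bias term (illustrative base model)."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)

def predict_linear(w, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w

def select_next(X_pool, labeled_idx, y_labeled, n_models=5, rng=None):
    """Return the pool index of the unlabeled sample with the highest combined score.

    Hypothetical scoring, assumed for illustration only.
    """
    rng = np.random.default_rng(rng)
    labeled_idx = np.asarray(labeled_idx)
    y_labeled = np.asarray(y_labeled)
    unlabeled = np.setdiff1d(np.arange(len(X_pool)), labeled_idx)
    X_lab, X_unl = X_pool[labeled_idx], X_pool[unlabeled]

    # Informativeness: prediction variance across a bootstrap committee.
    preds = []
    for _ in range(n_models):
        boot = rng.integers(0, len(labeled_idx), size=len(labeled_idx))
        w = fit_linear(X_lab[boot], y_labeled[boot])
        preds.append(predict_linear(w, X_unl))
    informativeness = np.var(np.stack(preds), axis=0)

    # Diversity: distance from each candidate to its nearest labeled sample.
    d_lab = np.linalg.norm(X_unl[:, None, :] - X_lab[None, :, :], axis=2)
    diversity = d_lab.min(axis=1)

    # Representativeness: candidates close to many unlabeled samples score higher.
    d_unl = np.linalg.norm(X_unl[:, None, :] - X_unl[None, :, :], axis=2)
    representativeness = 1.0 / (1e-12 + d_unl.mean(axis=1))

    # Assumed combination rule: product of the three scores.
    score = (informativeness + 1e-12) * diversity * representativeness
    return int(unlabeled[np.argmax(score)])

# Usage on synthetic data: start with 3 labeled samples, query 5 more.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(50, 2))
y_true = X @ np.array([1.5, -2.0]) + 0.1 * data_rng.normal(size=50)
labeled = [0, 1, 2]
for _ in range(5):
    nxt = select_next(X, labeled, y_true[labeled], rng=0)
    labeled.append(nxt)  # in practice, this is where the label would be requested
```

A product is used here so that a candidate must do reasonably well on all three criteria to be selected. A weighted sum or a staged filter would be equally plausible alternatives.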