Experimental Design with Multiple Kernels
Hanmo Wang, Liang Du, Lei Shi, Peng Zhou, Yuhua Qian, Yi-Dong Shen
DOI: https://doi.org/10.1109/icdm.2015.107
2015-01-01
Abstract: In classification tasks, labeled data is necessary but often difficult or expensive to obtain, whereas unlabeled data is usually abundant. Recently, various active learning algorithms have been proposed to alleviate this issue by selecting the most informative data points to label. One family of active learning methods comes from Optimum Experimental Design (OED) in statistics. Instead of selecting data points one by one iteratively, OED-based approaches select data in a one-shot manner: a fixed-size subset is selected from the unlabeled dataset for manual labeling. These methods usually use kernels to represent pairwise similarities between data points. It is well known that choosing the optimal kernel type (e.g., Gaussian kernel) and kernel parameters (e.g., kernel width) is difficult, and a common way to address this is Multiple Kernel Learning (MKL), i.e., constructing several candidate kernels and merging them to form a consensus kernel. There are different ways to combine multiple kernels. One of them, called the globalised approach, assigns a single weight to each candidate kernel. In practice, however, different data points in the same candidate kernel may not contribute equally to the consensus kernel. This requires assigning different weights to different data points within the same candidate kernel, which leads to the localized approach. In this paper, we introduce MKL to OED-based active learning; specifically, we propose a globalised and a localized multiple kernel active learning method. Our experiments on six benchmark datasets demonstrate that the proposed methods perform better than existing OED-based active learning methods.
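To make the distinction between the two combination schemes concrete, the following is a minimal sketch in Python with NumPy. It is not the paper's formulation: the abstract does not give the exact objective or weight constraints, so the function names (`gaussian_kernel`, `combine_global`, `combine_local`), the fixed example weights, and the specific localized form K[i, j] = sum_m W[m, i] K_m[i, j] W[m, j] are all assumptions chosen for illustration. In the actual methods the weights would presumably be learned together with the selected subset rather than fixed by hand.

```python
import numpy as np

def gaussian_kernel(X, width):
    """Gaussian (RBF) kernel matrix for the rows of X with a given width."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative values from rounding
    return np.exp(-d2 / (2.0 * width ** 2))

def combine_global(kernels, w):
    """Globalised combination (assumed form): one weight per candidate kernel.

    kernels has shape (M, n, n) and w has shape (M,); returns sum_m w[m] * K_m.
    """
    return np.tensordot(w, kernels, axes=1)

def combine_local(kernels, W):
    """Localized combination (assumed form): one weight per kernel AND per point.

    kernels has shape (M, n, n) and W has shape (M, n); returns
    K[i, j] = sum_m W[m, i] * K_m[i, j] * W[m, j].
    """
    return np.einsum("mi,mij,mj->ij", W, kernels, W)

# Hypothetical toy data: 5 points in 2-D and three candidate kernel widths.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
kernels = np.stack([gaussian_kernel(X, s) for s in (0.5, 1.0, 2.0)])

# Hand-picked illustrative weights (not learned).
w = np.array([0.2, 0.3, 0.5])
K_global = combine_global(kernels, w)

# If every point shares the same per-kernel weight sqrt(w[m]), the localized
# form reduces to the globalised one.
W = np.tile(np.sqrt(w)[:, None], (1, X.shape[0]))
K_local = combine_local(kernels, W)
```

Under this assumed form, the localized consensus kernel stays symmetric and positive semidefinite whenever each candidate kernel is, because each term is a candidate kernel scaled on both sides by the same diagonal weight matrix. The globalised scheme appears as the special case in which the per-point weights of a kernel are all equal.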