Sample-Efficient Kernel Mean Estimator with Marginalized Corrupted Data

Xiaobo Xia, Shuo Shan, Mingming Gong, Nannan Wang, Fei Gao, Haikun Wei, Tongliang Liu
DOI: https://doi.org/10.1145/3534678.3539318
2022-01-01
Abstract: Estimating the kernel mean in a reproducing kernel Hilbert space is central to many kernel-based learning algorithms. Given a finite sample, the empirical average is the standard estimate of the target kernel mean. Prior work has shown that better estimators can be constructed by shrinkage methods. In this work, we propose to corrupt data examples with noise from known distributions and present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates the kernel mean under the corrupted distributions. Theoretically, we show that the marginalized kernel mean estimator introduces implicit regularization in kernel mean estimation. Empirically, on a variety of tasks, we show that the marginalized kernel mean estimator is sample-efficient and obtains much lower estimation errors than existing estimators.
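The contrast between the empirical average and an estimator marginalized over corrupted data can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian RBF kernel and isotropic Gaussian corruption, neither of which the abstract specifies, and the function names and parameters (sigma, noise_std) are hypothetical. Under those assumptions the expectation over the noise has a closed form, so no corrupted copies need to be sampled.

```python
import numpy as np


def rbf_kernel(X, Y, sigma):
    """Gaussian RBF kernel matrix: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def empirical_kernel_mean(X, query, sigma):
    """Evaluate the empirical average (1/n) sum_i k(x_i, q) at each query q."""
    return rbf_kernel(X, query, sigma).mean(axis=0)


def marginalized_kernel_mean(X, query, sigma, noise_std):
    """Evaluate (1/n) sum_i E[k(x_i + eps, q)] with eps ~ N(0, noise_std^2 I).

    Assumption (not from the abstract): for the Gaussian RBF kernel and
    Gaussian noise the expectation is a rescaled RBF kernel with the
    wider bandwidth sqrt(sigma^2 + noise_std^2).
    """
    d = X.shape[1]
    total_var = sigma**2 + noise_std**2
    scale = (sigma**2 / total_var) ** (d / 2.0)
    return scale * rbf_kernel(X, query, np.sqrt(total_var)).mean(axis=0)
```

In this setting, marginalizing over the corruption widens the effective bandwidth and shrinks the kernel values, which is one way to picture the implicit regularization mentioned in the abstract. Setting noise_std to 0 recovers the empirical average.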