Sample Weighting: an Inherent Approach for Outlier Suppressing Discriminant Analysis

Chuan-Xian Ren, Dao-Qing Dai, Xiaofei He, Hong Yan
DOI: https://doi.org/10.1109/tkde.2015.2448547
IF: 9.235
2015-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: As data acquisition technologies develop rapidly, both the volume and the variety of data keep growing. However, the data usually contain noise and outliers, which degrade the practical performance of learning algorithms in data mining and pattern analysis. To address this problem, the importance of each individual sample in building the optimal subspace is explored, and an importance-sampling-inspired method is proposed for outlier-suppressing feature extraction. First, each sample is assigned a weight, estimated by the graph Laplacian, and an approximated mean is then calculated for each subject. By highlighting the most subject-oriented samples, the weighted average and the scatter metrics can be measured with maximum margins and superior classification performance. The supervised information integrates local data structure with the samples' respective contributions to building the optimal subspace. The linear criterion can be extended to the nonlinear case by the kernel trick. A regularization framework is proposed to deal with the rank-deficiency problem, which is usually induced by the small sample size of the training set. The competitive performance of the algorithm has been validated by extensive experiments on synthetic and benchmark data, including facial images and gene microarray data.
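The weighting step described in the abstract can be illustrated with a small sketch. The abstract does not give the weight formula, so everything below is an assumption made for illustration: each sample's weight is taken to be its normalized degree in a Gaussian affinity graph built over the samples of one subject (the degree being the diagonal of the matrix D in the graph Laplacian L = D - W), and the names `degree_weights`, `weighted_mean`, and `sigma` are hypothetical. It is not the authors' estimator, only a way to see how a weighted mean can suppress an outlier.

```python
import math


def degree_weights(samples, sigma=1.0):
    """Assumed weighting: normalized degree in a Gaussian affinity graph.

    Samples far from all others get a degree near zero, hence a small weight.
    """
    n = len(samples)
    degrees = []
    for i in range(n):
        d = 0.0
        for j in range(n):
            if i != j:
                sq_dist = sum((a - b) ** 2 for a, b in zip(samples[i], samples[j]))
                d += math.exp(-sq_dist / (2.0 * sigma ** 2))
        degrees.append(d)
    total = sum(degrees)
    if total == 0.0:
        # Degenerate case (e.g. a single sample): fall back to uniform weights.
        return [1.0 / n] * n
    return [d / total for d in degrees]


def weighted_mean(samples, weights):
    """Weighted average of the samples, one coordinate at a time."""
    dim = len(samples[0])
    return [sum(w * x[k] for w, x in zip(weights, samples)) for k in range(dim)]


# Four tightly clustered points of one subject plus a single outlier.
cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
weights = degree_weights(cluster)
plain = [sum(x[k] for x in cluster) / len(cluster) for k in range(2)]
robust = weighted_mean(cluster, weights)
print("plain mean:   ", plain)
print("weighted mean:", robust)
```

In this toy setting the plain mean is pulled toward the outlier, while the weighted mean stays near the center of the four clustered points. The weighted scatter measures and the regularized subspace criterion mentioned in the abstract are not covered by this sketch.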