EVALUATE DISSIMILARITY OF SAMPLES IN FEATURE SPACE FOR IMPROVING KPCA

Yong Xu, David Zhang, Jian Yang, Zhong Jin, Jingyu Yang
DOI: https://doi.org/10.1142/s0219622011004415
2011-01-01
Abstract: In kernel principal component analysis (KPCA), each eigenvector in the feature space is a linear combination of all the samples in the training set, so the computational efficiency of KPCA-based feature extraction falls as the training set grows. In this paper, we propose a novel KPCA-based feature extraction method that assumes an eigenvector can be expressed approximately as a linear combination of a subset of the training samples ("nodes"). The new method selects maximally dissimilar samples as nodes, which allows the eigenvector to contain the maximum amount of information about the training set. By using the distance between training samples in the feature space to evaluate their dissimilarity, we devise a very simple and quite efficient algorithm that identifies the nodes and produces the sparse KPCA. The experimental results show that the proposed method also obtains high classification accuracy.
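The abstract does not spell out the node-selection rule, so the following is only a minimal sketch of the idea it describes. The one fact it relies on is the kernel identity ||phi(x) - phi(y)||^2 = k(x, x) - 2k(x, y) + k(y, y), which gives the feature-space distance without computing phi explicitly. Everything else is an assumption made for illustration: the Gaussian kernel and its `gamma` value, the function names `rbf_kernel`, `feature_space_sq_dist`, and `select_nodes`, and the greedy farthest-point heuristic, which stands in for the authors' algorithm and is not claimed to match it.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are illustrative assumptions."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def feature_space_sq_dist(x, y, kernel=rbf_kernel):
    """Squared feature-space distance ||phi(x) - phi(y)||^2 via the kernel trick."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def select_nodes(samples, n_nodes, kernel=rbf_kernel):
    """Pick n_nodes mutually dissimilar samples and return their indices.

    Assumed heuristic (greedy farthest-point selection), not the paper's
    exact rule: start from sample 0, then repeatedly add the unchosen sample
    whose smallest feature-space distance to the chosen nodes is largest.
    """
    if not 0 < n_nodes <= len(samples):
        raise ValueError("n_nodes must be between 1 and len(samples)")
    chosen = [0]
    # min_dist[i] = squared distance from sample i to its nearest chosen node
    min_dist = [feature_space_sq_dist(s, samples[0], kernel) for s in samples]
    while len(chosen) < n_nodes:
        candidates = [i for i in range(len(samples)) if i not in chosen]
        nxt = max(candidates, key=lambda i: min_dist[i])
        chosen.append(nxt)
        # Refresh nearest-node distances against the newly added node.
        for i, s in enumerate(samples):
            d = feature_space_sq_dist(s, samples[nxt], kernel)
            if d < min_dist[i]:
                min_dist[i] = d
    return chosen


if __name__ == "__main__":
    # Three loose groups of points; one node per group is the hoped-for outcome.
    data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 5.0)]
    print(select_nodes(data, 3))
```

With nodes chosen this way, a sparse KPCA would restrict the eigenvector expansion to the selected samples, so extracting features for a new sample needs only `n_nodes` kernel evaluations instead of one per training sample.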