A Sparse Reconstructive Evidential K-Nearest Neighbor Classifier for High-dimensional Data
Chaoyu Gong, Qian Wang, Yang You, Zhi-gang Su, Pei-hong Wang
DOI: https://doi.org/10.1109/tkde.2022.3157346
IF: 9.235
2022-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: The Evidential K-Nearest Neighbor (EK-NN) classification rule provides a global treatment of uncertainty and imprecision in class labels and has been widely used in pattern recognition. Nevertheless, EK-NN still relies on a hyper-parameter K that is fixed in advance without prior knowledge, even though the spatial distribution of neighbors in Euclidean space differs from pattern to pattern. More concretely, the neighbors of some patterns may provide confusing information and thus lead to wrong classification results. To address this issue, we propose a sparse reconstructive evidential K-NN (SEK-NN) classifier, which determines an individual K for each pattern and maps the correlations between patterns from Euclidean space to a sparse reconstructed space. To match this sparse reconstructed space, SEK-NN replaces the Euclidean distance with correlation coefficients to measure the dissimilarities between patterns. For high-dimensional data, a parallel version of SEK-NN is implemented on Apache Spark to speed up parameter estimation. We respectively test SEK-NN and parallel SEK-NN on 19 medium-dimensional datasets, 1 medium-volume dataset, and 4 high-dimensional datasets with up to 100 thousand dimensions. Experimental results show that SEK-NN achieves strong prediction performance and that parallel SEK-NN can handle high-dimensional datasets.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
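
The evidential K-NN rule that the abstract builds on can be illustrated with a minimal sketch. It is not the authors' SEK-NN: it uses a single fixed K, omits the sparse reconstruction step entirely, and takes 1 minus the Pearson correlation as an assumed form of correlation-based dissimilarity. The function names, the `OMEGA` key, and the values of `alpha` and `gamma` are hypothetical choices made for illustration only. Each neighbor contributes a mass function that supports its own class and leaves the remainder on the whole frame of classes; the masses are then pooled with Dempster's rule.

```python
import math

OMEGA = "Omega"  # hypothetical key for the whole frame of classes (ignorance)


def pearson_dissimilarity(x, y):
    """Return 1 - Pearson correlation, an assumed dissimilarity in [0, 2]."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 1.0  # correlation undefined; treat the pair as uncorrelated
    return 1.0 - cov / math.sqrt(var_x * var_y)


def dempster_combine(m1, m2, classes):
    """Dempster's rule for masses whose focal sets are singletons and OMEGA."""
    combined = {
        c: m1[c] * m2[c] + m1[c] * m2[OMEGA] + m1[OMEGA] * m2[c]
        for c in classes
    }
    combined[OMEGA] = m1[OMEGA] * m2[OMEGA]
    total = sum(combined.values())  # equals 1 minus the conflicting mass
    if total == 0:
        raise ValueError("total conflict between the two mass functions")
    return {key: value / total for key, value in combined.items()}


def eknn_predict(query, train_x, train_y, k=3, alpha=0.95, gamma=1.0):
    """Classify `query` with a fixed-K evidential K-NN rule (illustrative)."""
    classes = sorted(set(train_y))
    scored = sorted(
        ((pearson_dissimilarity(query, x), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    # Start from the vacuous mass function: all belief on OMEGA.
    mass = {c: 0.0 for c in classes}
    mass[OMEGA] = 1.0
    for dissimilarity, label in scored[:k]:
        support = alpha * math.exp(-gamma * dissimilarity ** 2)
        neighbor_mass = {c: 0.0 for c in classes}
        neighbor_mass[label] = support
        neighbor_mass[OMEGA] = 1.0 - support
        mass = dempster_combine(mass, neighbor_mass, classes)
    predicted = max(classes, key=lambda c: mass[c])
    return predicted, mass


if __name__ == "__main__":
    train_x = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 6, 4, 2]]
    train_y = ["up", "up", "down", "down"]
    label, mass = eknn_predict([1, 3, 5, 7], train_x, train_y, k=3)
    print(label, mass)
```

Because `alpha` is kept below 1, every neighbor leaves some mass on `OMEGA`, so the combination never reaches total conflict. In this fixed-K form a distant or weakly correlated neighbor still enters the combination, which is the limitation the abstract attributes to presetting K.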